The phase of a single-mode field can be measured in a single-shot measurement by interfering 
the field with an effectively classical local oscillator of known phase. The standard technique is to 
have the local oscillator detuned from the system (heterodyne detection) so that it is sometimes 
in phase and sometimes in quadrature with the system over the course of the measurement. This 
enables both quadratures of the system to be measured, from which the phase can be estimated. 
One of us [H.M. Wiseman, Phys. Rev. Lett. 75, 4587 (1995)] has shown recently that it is possible 
to make a much better estimate of the phase by using an adaptive technique in which a resonant 
local oscillator has its phase adjusted by a feedback loop during the single-shot measurement. In 
Ref. [H.M. Wiseman and R.B. Killip, Phys. Rev. A 56, 944] we presented a semiclassical analysis 
of a particular adaptive scheme, which yielded asymptotic results for the phase variance of strong 
fields. In this paper we present an exact quantum mechanical treatment. This is necessary for 
calculating the phase variance for fields with small photon numbers, and also for considering figures 
of merit other than the phase variance. Our results show that an adaptive scheme is always superior 
to heterodyne detection as far as the variance is concerned. However the tails of the probability 
distribution are surprisingly high for this adaptive measurement, so that it does not always result 
in a smaller probability of error in phase-based optical communication. 
I. INTRODUCTION 

In a typical textbook of quantum mechanics one might 
find a statement such as 

Every physical quantity $Z$ has associated with it an Hermitian operator $\hat{Z}$. A
measurement of $Z$ for a system with state matrix $\rho$ will yield a result $z$ which is an
eigenvalue of $\hat{Z}$. The probability of getting the result $z$ is equal to
$\langle z|\rho|z\rangle$ where $\hat{Z}|z\rangle = z|z\rangle$.

Unfortunately the number of measurements of physical quantities for which this quantum
measurement theory applies is very small. Nevertheless there are some in the context of
quantum optics. It is only detector inefficiencies (now quite small) which limit the
measurement of the photon number with operator $a^\dagger a$ and quadratures with operators
such as $X = a + a^\dagger$ for single-mode optical fields. The former can be measured by
direct photon counting and the latter by adding an essentially classical field of known phase
(called the local oscillator) to the quantum field before counting photons (see for example
Ref. [?]).

There is one obvious optical quantity of which we cannot make a quantum-limited measurement:
the phase $\phi$ of the electromagnetic field. Despite the difficulties in defining a phase
operator (which can be overcome [?]), the "phase eigenstates" $|\phi\rangle$ are independent
of any phase operator (see Sec. II B) and have been recognized for a very long time [?]. The
opinion is sometimes expressed that the reason one cannot measure phase is that the phase
eigenstates do not have (even approximately) compact support on the number states, so that a
measurement of phase would require infinite energy. This argument is specious, because the
eigenstates of $a + a^\dagger$ also do not have compact support on the energy eigenstates,
and yet in the limit of infinite local oscillator strength and perfect photodetection a
homodyne measurement approaches a quadrature measurement. Nevertheless it is true that phase
cannot be measured exactly, even in these ideal limits. The reason for this will be explored
in the discussion section.

Although the quantum phase of a single-mode field cannot be measured exactly, it can be
measured approximately. As well as being interesting for theoretical reasons, there may be
practical reasons for wishing to measure phase. For example, quantum-limited communication
could be possible by encoding information in the phase of single-mode pulses of light. The
first requirement for such a scheme would be to create states with very well-defined phase.
This has been investigated by various authors (see Ref. [?] for some of these). The next step
would be encoding the signal, which is easy to do using an electro-optic modulator. The third
requirement is for the receiver to measure the encoded phase as accurately as possible. This
is a problem which seems not to have received the amount of attention it deserves, given that
it is as important to communication as the generation of states with well-defined phase.
Another application for accurate phase measurements could be in inferring the properties of
other quantum systems which can cause a phase shift, such as the presence of an atom at a
particular point in a single-mode standing wave.

The standard way of measuring phase (approximately) is to use two simultaneous homodyne
measurements of orthogonal quadratures (known as eight-port homodyne detection), or
heterodyne detection, which are equivalent in an appropriate limit [?]. A way to improve upon
this was first suggested by one of us [?]: single-shot adaptive measurements. By this we mean
the use of measurement results from earlier stages of a single measurement to affect the
conditions of the measurement in its later stages. In this case it means using the
photocurrent up to time $t$ to control the local oscillator phase at time $t$ by a feedback
loop, during the detection of a single single-mode pulse. In Ref. [7] we investigated a
particular feedback algorithm, illustrated in Fig. 1, using semiclassical theory. We showed
that for large fields an adaptive measurement is a much closer approximation to a true phase
measurement than is heterodyne detection.

In this paper we continue our analysis of the simple adaptive algorithm, but this time we
present the full quantum theory of these adaptive phase measurements. The background theory
required is presented in Sec. II. This introduces the theory of probability-operator-measures
(POMs) which is required for approximate measurements. It also summarizes the theory of POMs
for phase measurements and POMs for measurements using a large local oscillator. In Sec. III
we derive expressions for the POMs for the two adaptive phase measurement schemes of
Ref. [7]. In Sec. IV we use these POMs to calculate phase variances, for coherent states and
for phase-optimized states with an upper bound on the photon number. We compare our exact
(quantum) numerical results to the asymptotic (semiclassical) analytical results obtained in
Ref. [7]. One feature which can only be calculated using the full quantum theory is the
overall shape of the probability distributions, including the tails. This is required for
determining the probability of error in phase communication schemes. This aspect is
investigated in Sec. V, again for coherent states and for phase-optimized states with an
upper bound on the photon number. Sec. VI concludes with a discussion on the ultimate limits
to phase measurements.



II. PROBABILITY-OPERATOR-MEASURES 

A. General theory of POMs 

If (as in the present case) we are unconcerned about the fate of the system after it has been
measured, then any measurement is completely described by the probability for each of the
possible results to occur. Let the set of all possible measurement results $\lambda$ be
denoted $\Omega$. Then the measurement is specified by a probability-measure (PM) on
$\Omega$. If we denote the PM as $P$ then for any subset $E \subset \Omega$, we can identify
$P(E)$ as the probability to obtain a measurement result $\lambda \in E$. Of course this
requires $P(\Omega) = 1$.



For quantum mechanical systems, the most general way of generating a PM $P$ is as the
expectation value of an operator-measure $F$ on $\Omega$. That is, for a quantum system with
state matrix $\rho$,

$$ P(E) = \mathrm{Tr}[\rho F(E)]. \tag{2.1} $$

Obviously $F(E)$ must be a positive operator, and by conservation of probability

$$ F(\Omega) = 1. \tag{2.2} $$

For this reason we call $F$ a probability-operator-measure (POM), or sometimes an
effect-valued measure [?,?]. Note that even for a subset $E$ with a single element $\lambda$,
$F(\lambda)$ is not necessarily a projector.



B. POMs for phase measurements 

Now consider the case where the measured quantity is to be a phase $\phi$ of a single-mode
photon field, so that $F$ is a POM on $\Omega = [0, 2\pi)$. Quantum mechanically this phase
should in some sense be conjugate to the photon number operator $a^\dagger a$, but as long as
we stick with POMs to describe the measurement there are none of the difficulties associated
with defining a phase operator [?]. Since phase is a continuous variable, we will use
$F(\phi)$ to denote the phase POM density. The completeness relation for a phase POM is
therefore written as

$$ \int_0^{2\pi} d\phi\, F(\phi) = 1. \tag{2.3} $$

As explained in Ref. [?], for $F(\phi)$ to be invariant under phase shifts, and to be
unbiased, implies that it can be written in the form

$$ F(\phi) = \frac{1}{2\pi} \sum_{m,n=0}^{\infty} |m\rangle\langle n|\, e^{i(m-n)\phi} H_{mn}. \tag{2.4} $$

Here $H$ is a positive-semidefinite Hermitian matrix with all entries real and positive, and
$|m\rangle$ is the number state satisfying $a^\dagger a|m\rangle = m|m\rangle$.

The completeness condition (2.3) implies that

$$ \forall\, m, \quad H_{mm} = 1. \tag{2.5} $$

The positivity condition on the matrix $H$ obviously requires that the off-diagonal elements
be less than or equal to unity. A unique phase measurement is defined by specifying that all
of the off-diagonal elements be equal to unity. This is what has recently been called a
canonical phase measurement [?], although its special role was recognized very early in the
history of quantum theory [?].

In realistic phase measurements the off-diagonal elements $H_{mn}$ will be less than unity,
but for $|m - n| = 1$ and $m \gg 1$ they should be close to unity if the measurement is to be
a good phase measurement, as will be seen in Sec. IV. In fact, in all of the measurements we
examine, we have

$$ h(m) = 1 - H_{m,m+1} \le O(m^{-1/2}). \tag{2.6} $$

For a canonical measurement $h(m)$ is identically zero. In this case we can write the POM
(2.4) as

$$ F^{\rm can}(\phi) = \frac{1}{2\pi} |\phi\rangle\langle\phi|, \tag{2.7} $$

where $|\phi\rangle$ is an unnormalized phase eigenstate

$$ |\phi\rangle = \sum_{n=0}^{\infty} e^{in\phi} |n\rangle, \tag{2.8} $$

as referred to in the introduction.
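
As a short consistency check (a worked step added here for illustration, writing the canonical POM density as $F^{\rm can}(\phi) = |\phi\rangle\langle\phi|/2\pi$), expanding the phase eigenstates in number states recovers the general form (2.4) with every $H_{mn} = 1$, and the completeness relation (2.3) then follows from the orthogonality of the Fourier modes:

$$ F^{\rm can}(\phi) = \frac{1}{2\pi} \sum_{m,n=0}^{\infty} |m\rangle\langle n|\, e^{i(m-n)\phi}, \qquad \int_0^{2\pi} d\phi\, F^{\rm can}(\phi) = \sum_{m,n=0}^{\infty} |m\rangle\langle n|\, \delta_{mn} = 1. $$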



C. POMs for dyne measurements 

We now turn from the POMs for phase measurements of a single-mode field to the POMs for
measurements on a single-mode photon field made by interfering the light from that field with
another field which has a macroscopic coherent excitation. This can be done at a beam
splitter, and the two output fields of the beam splitter can then be detected by normal
photodetectors. The second field can be treated classically as a c-number, and is known as a
local oscillator. All practical phase-sensitive measurements require a local oscillator, to
act as a phase reference. If the local oscillator is resonant with the system field then this
type of measurement is known as homodyne detection. If the local oscillator is detuned
(outside the bandwidth of the system field) then this is known as heterodyne detection. In
considering phase measurements we will have to consider other sorts of measurements involving
interference with a quasiclassical local oscillator. In ignorance of any received term for
such measurements we will call them examples of dyne detection, so that homodyne and
heterodyne are obviously special cases.

Let us assume that our single-mode signal field has a temporal pulse-shape $u(t)$ which is
positive and normalized as

$$ \int_0^T u(t)\, dt = 1. \tag{2.9} $$

Here we are obviously ignoring the phase variation at optical frequency $\omega$; $u(t)$ is
the envelope function. The total time $T$ is necessarily much greater than $\omega^{-1}$ so
that the pulse can be considered monochromatic. This is essential in order for the dyne
measurements (which are phase-sensitive measurements) to be quantum-limited, that is, for
quantum effects to provide the limit to the phase uncertainty in the measurement. If the
characteristic spectral width of the pulse $\Gamma \gtrsim T^{-1}$ is too large then the
phase uncertainty will be dominated by the term $\delta\phi \sim \Gamma/\omega$ coming from
the uncertainty $\Gamma$ in the frequency. In all that follows we assume this uncertainty to
be negligible.

For simplicity we will take the beam splitter at which the system and local oscillator fields
are interfered to be balanced (50/50). Then, ignoring vacuum fluctuations, the two fields at
the two output ports of the beam splitter are equal to

$$ b_\pm(t) = \sqrt{u(t)/2}\, \left( a \pm \beta e^{i\Phi(t)} \right), \tag{2.10} $$

where $a$ is the annihilation operator for the system and the real number $\beta$ is the
coherent amplitude of the local oscillator. This is normalized so that the instantaneous rate
of photodetection at each detector is $\langle b_\pm^\dagger(t) b_\pm(t) \rangle$. We have
assumed that the intensity-profile of the local oscillator is the same as that of the system.
However, we have included an arbitrary phase variation $\Phi(t)$ of the local oscillator
relative to the system. The total number of photons in the local oscillator is $\beta^2$, so
we are interested in the limit $\beta^2 \gg 1, \langle a^\dagger a \rangle$. For homodyne
detection $\Phi(t) = \Phi_0$, a constant. For heterodyne detection
$\Phi(t) = \Phi_0 + t\Delta$, where $\Delta \gg \Gamma$ is the detuning.

The signal of interest is simply the difference between the two photocurrents at the two
detectors (labeled $\pm$). If we denote the number of photocounts at each of the detectors in
the time interval $[t, t + \delta t)$ by $\delta N_\pm(t)$ then we can define the signal
photocurrent as

$$ I(t) = \lim_{\delta t \to 0} \lim_{\beta \to \infty} \frac{\delta N_+(t) - \delta N_-(t)}{\beta\, \delta t}. \tag{2.11} $$

Note that the two limits here do not commute. The limit $\beta \to \infty$ implies that both
photocounts will be dominated by the contribution from the local oscillator. The fact that
the limit $\delta t \to 0$ is taken second indicates that we are only interested in the
fluctuations in $I(t)$ on a time scale much greater than the mean time
$\sim [u(t) \beta^2]^{-1}$ between photodetections.

The general quantum theory of dyne measurements was derived by one of us in Ref. [10] for the
case where the system mode is derived from an exponentially decaying cavity so that
$u(t) = \gamma e^{-\gamma t}$ where $\gamma$ is the cavity linewidth. This is easily
generalized for arbitrary $u(t)$. First we define a scaled time variable

$$ v = \int_0^t u(s)\, ds. \tag{2.12} $$

This is dimensionless, and increases monotonically with $t$ from 0 to 1. For the case
$u(t) = \gamma e^{-\gamma t}$ we have $v = 1 - e^{-\gamma t}$. The photocurrent in terms of
$v$ is scaled so that

$$ I(v)\, dv = I(t)\, dt = dv\, I(t)/u(t). \tag{2.13} $$
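
The change of variable in Eq. (2.12) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the helper name `scaled_time`, the midpoint rule, and the value of `gamma` are all assumptions chosen for illustration. It integrates the exponential pulse shape and compares the result with the closed form $v = 1 - e^{-\gamma t}$.

```python
import math

def scaled_time(u, t, steps=10_000):
    """Midpoint-rule estimate of v(t) = integral_0^t u(s) ds, as in Eq. (2.12)."""
    ds = t / steps
    return sum(u((k + 0.5) * ds) for k in range(steps)) * ds

gamma = 2.0  # hypothetical cavity linewidth (arbitrary units)

def u_exp(s):
    """Pulse envelope of an exponentially decaying cavity."""
    return gamma * math.exp(-gamma * s)

v_numeric = scaled_time(u_exp, 1.5)
v_closed = 1.0 - math.exp(-gamma * 1.5)
```

Any other positive, normalized envelope could be passed as `u`; only the exponential case has the closed form used for comparison here.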



Now the measurement result for a dyne measurement up to time $t$ is the complete photocurrent
record $I(t')$ from $t' = 0$ to $t' = t$ [or equivalently, $I(v')$ from $v' = 0$ to
$v' = v$]. This record is, in theory at least, a continuous infinity of real numbers, which
is an impractically huge amount of data. Fortunately it turns out that there are just two
sufficient statistics at scaled time $v$ (henceforth called simply time), namely the two
complex numbers

$$ A_v = \int_0^v I(u)\, e^{i\Phi(u)}\, du, \tag{2.14} $$
$$ B_v = -\int_0^v e^{2i\Phi(u)}\, du. \tag{2.15} $$

We call these the sufficient statistics because, as shown in Ref. [10], the POM for the
measurement at time $0 < v < 1$ is given by

$$ G_v(A_v, B_v) = Q_v(A_v, B_v) \exp\left( \tfrac{1}{2} B_v a^{\dagger 2} + A_v a^\dagger \right) (1 - v)^{a^\dagger a/2} \exp\left( \tfrac{1}{2} B_v^* a^2 + A_v^* a \right), \tag{2.16} $$

where $Q_v(A_v, B_v)$ is a positive function which will be defined shortly. This implies that
the probability for obtaining any photocurrent $\{I(u) : 0 < u < v\}$ is determined only by
the two complex functionals of this current $A_v$ and $B_v$. Any other features of
$\{I(u) : 0 < u < v\}$ are completely irrelevant.
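
To make the data reduction concrete, the two integrals can be accumulated from a sampled record. This is a minimal sketch under stated assumptions: the record is taken to be already binned in scaled time with equal bin width `dv`, and the function name and argument layout are hypothetical, not taken from the paper.

```python
import cmath

def sufficient_statistics(current, phase, dv):
    """Riemann-sum versions of A_v, Eq. (2.14), and B_v, Eq. (2.15).

    current[k] : photocurrent I(u_k) in scaled time (assumed pre-binned)
    phase[k]   : local oscillator phase Phi(u_k) in the same bin
    dv         : width of each scaled-time bin
    """
    A = 0j
    B = 0j
    for I_k, Phi_k in zip(current, phase):
        A += I_k * cmath.exp(1j * Phi_k) * dv
        B -= cmath.exp(2j * Phi_k) * dv
    return A, B

# Toy record: constant current and a fixed (homodyne-like) phase Phi = 0.
steps = 1000
A, B = sufficient_statistics([1.0] * steps, [0.0] * steps, 1.0 / steps)
```

For this toy record the sums give $A = 1$ and $B = -1$; however long the record, only these two complex numbers need to be stored.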

It might be thought that the second integral $B_v$ does not depend on
$\{I(u) : 0 < u < v\}$ at all, because the photocurrent does not appear explicitly in
Eq. (2.15). However, it may appear implicitly if the local oscillator phase $\Phi(v)$ depends
upon $\{I(u) : 0 < u < v\}$. This is precisely the situation we will consider later to
construct a phase measurement. When we do so, the theory presented here shows that $\Phi(v)$
should be made to depend on $\{I(u) : 0 < u < v\}$ only through the two integrals (2.14),
(2.15). That is to say, we should have

$$ \Phi(v) = f_v(A_v, B_v) \tag{2.17} $$

for some (possibly time-dependent) function $f$. This is an extremely powerful result which
is not at all intuitive.

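In the limit $v \to 1$, $(1 - v)^{a^\dagger a/2} \to |0\rangle\langle 0|$, where $|0\rangle$
is the vacuum state. So, dropping the subscript $v$ when $v = 1$, we can write the POM (2.16)
as

$$ G(A, B) = Q(A, B)\, |\tilde\psi(A, B)\rangle \langle \tilde\psi(A, B)|, \tag{2.18} $$

where $|\tilde\psi(A, B)\rangle$ is an unnormalized ket defined by

$$ |\tilde\psi(A, B)\rangle = \exp\left( \tfrac{1}{2} B a^{\dagger 2} + A a^\dagger \right) |0\rangle. \tag{2.19} $$

With a little operator algebra it is easy to show that this is proportional to the squeezed
state [?]

$$ |\alpha, \epsilon\rangle = \exp\left( \alpha a^\dagger - \alpha^* a \right) \exp\left( \tfrac{1}{2} \epsilon^* a^2 - \tfrac{1}{2} \epsilon a^{\dagger 2} \right) |0\rangle, \tag{2.20} $$

where

$$ \alpha = \frac{A + B A^*}{1 - |B|^2}, \tag{2.21} $$
$$ \epsilon = -\frac{B\, \mathrm{atanh}|B|}{|B|}. \tag{2.22} $$

From Eq. (2.15), it is evident that $|B| \le 1$. For the schemes we will consider $|B| < 1$
with probability one, so that the two expressions (2.21), (2.22) are well-defined.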
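
The map from the sufficient statistics to the squeezed-state parameters, $\alpha = (A + BA^*)/(1 - |B|^2)$ and $\epsilon = -B\,\mathrm{atanh}|B|/|B|$, is simple to evaluate. The sketch below is illustrative only: the function name is hypothetical, and treating $B = 0$ as the limit $\epsilon = 0$ is an assumption made to avoid a division by zero.

```python
import math

def squeezed_parameters(A, B):
    """Map (A, B) to (alpha, epsilon) via Eqs. (2.21) and (2.22); requires |B| < 1."""
    r = abs(B)
    if r >= 1.0:
        raise ValueError("Eqs. (2.21)-(2.22) are only defined for |B| < 1")
    alpha = (A + B * A.conjugate()) / (1.0 - r * r)
    # The B -> 0 limit of -B atanh|B| / |B| is 0 (no squeezing).
    epsilon = -B * math.atanh(r) / r if r > 0.0 else 0j
    return alpha, epsilon

alpha_het, eps_het = squeezed_parameters(1 + 2j, 0j)  # B = 0: a coherent state
alpha_sq, eps_sq = squeezed_parameters(1.0, 0.5)      # real A and B for simplicity
```

With $B = 0$ the result is a coherent state of amplitude $A$, which is the case relevant to heterodyne detection.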

If we rewrite the POM (2.16) in terms of $\alpha, \epsilon$ instead of $A, B$, we have

$$ G'(\alpha, \epsilon) = Q'(\alpha, \epsilon)\, |\alpha, \epsilon\rangle \langle \alpha, \epsilon|, \tag{2.23} $$

where $Q'$ is some new positive function of $\alpha, \epsilon$. In this case the set of all
measurement results is $\Omega = \mathbb{C} \otimes \mathbb{C}$, where $\mathbb{C}$ denotes
the set of complex numbers. If we imagine varying the state of the system $|\psi\rangle$
(assumed pure), then the probability to obtain the result $\alpha, \epsilon$ is

$$ P(\alpha, \epsilon) \propto |\langle \alpha, \epsilon | \psi \rangle|^2. \tag{2.24} $$

Provided $\exp(|\epsilon|) \ll |\alpha|$, the squeezed state $|\alpha, \epsilon\rangle$ has a
well-defined coherent amplitude $\alpha$. Hence from Eq. (2.24) if the unknown system state
$|\psi\rangle$ is also localized in the phase plane, it is highly likely that it must have a
coherent amplitude close to $\alpha$. This fact will be used later to good effect.

We must now address the issue of how $Q(A, B)$ is found. In Ref. [10] it is shown that
$Q(A, B)$ is the joint probability distribution that $A, B$ would have if the photocurrent
$I(v)$ were given by

$$ I(v)\, dv = dW(v), \tag{2.25} $$

where $dW(v)$ is the infinitesimal increment in a real Wiener process [?] satisfying

$$ \langle dW(v) \rangle = 0, \tag{2.26} $$
$$ dW(v)\, dW(v) = dv. \tag{2.27} $$

In Ref. [10], $Q(A, B)$ was called the ostensible probability distribution for $A, B$. It is
the probability distribution that $A, B$ would have if there were no signal whatsoever; that
is, if the system were prepared in the vacuum state. The noise in Eq. (2.25) then represents
the local oscillator shot noise (or vacuum fluctuations if a Heisenberg picture
interpretation is preferred). The presence of a non-zero signal determines the actual
probability distribution through the POM (2.18). That is to say, if the system state matrix
is $\rho$ then the true probability density is

$$ P(A, B)\, d^2A\, d^2B = Q(A, B)\, \langle \tilde\psi(A, B)|\rho|\tilde\psi(A, B)\rangle\, d^2A\, d^2B. \tag{2.28} $$

Before moving onto specific examples in the following section, we will derive some general
results regarding the ostensible distribution $Q(A, B)$. First, the ostensible mean of $A$ is

$$ \langle A \rangle_Q = \left\langle \int_0^1 e^{i\Phi(v)}\, dW(v) \right\rangle = 0. \tag{2.29} $$

This holds true even if $\Phi(v)$ depends on the photocurrent record $\{I(u) : 0 < u < v\}$
because $W(v)$ is a strictly Markovian process. Second,

$$ \langle A^2 \rangle_Q = \int_0^1 \int_0^1 \left\langle e^{i\Phi(u) + i\Phi(v)}\, dW(u)\, dW(v) \right\rangle = \int_0^1 dv\, \left\langle e^{2i\Phi(v)} \right\rangle = -\langle B \rangle_Q. \tag{2.30} $$

Third,

$$ \langle |A|^2 \rangle_Q = \int_0^1 \int_0^1 \left\langle e^{i\Phi(u) - i\Phi(v)}\, dW(u)\, dW(v) \right\rangle = \int_0^1 dv = 1. \tag{2.31} $$



III. PHYSICALLY REALIZABLE PHASE MEASUREMENTS

A. Heterodyne Measurements

As noted in Sec. II B the ideal form of phase measurement is a canonical phase measurement in
which $H_{mn}$ from Eq. (2.4) is equal to unity for all $m, n$. This is plotted in
Fig. 2(a). All physically realizable phase measurements fall short of this ideal. The
simplest method for making a phase measurement is via heterodyne detection. As explained
above, this involves a local oscillator which is far detuned from the system. The linear
variation of the phase is in fact not essential; all that is required is that all relative
phases (of the system with respect to the local oscillator) be sampled equally and on a time
scale much shorter than the reciprocal bandwidth of the system. As long as there is a record
of the local oscillator phase as a function of time, the information in the photocurrent
record can be recovered. For definiteness, however, we will take the local oscillator phase
to simply change linearly with (scaled) time $v$. That is,

$$ \Phi(v) = \Phi_0 + v\Delta, \tag{3.1} $$

where $\Delta \gg 1$.

Having specified $\Phi(v)$ all that remains to completely describe this heterodyne
measurement is to determine $Q(A, B)$, the ostensible probability distribution for the
measurement results $A, B$. Because the above $\Phi(v)$ is independent of the photocurrent
$I$, the 'result' $B$ is a constant (rather than a random variable) with value

$$ B = -\int_0^1 dv\, \exp[2i(\Phi_0 + v\Delta)] \tag{3.2} $$
$$ \phantom{B} = \exp(2i\Phi_0)\, \frac{1 - \exp(2i\Delta)}{2i\Delta} \to 0, \tag{3.3} $$

where the final limit results from taking $\Delta \to \infty$. The only variable in this case
is therefore

$$ A = \int_0^1 dv\, I(v) \exp[i(\Phi_0 + v\Delta)]. \tag{3.4} $$

To find the ostensible statistics for $A$ we treat $I(v)\,dv$ as an independent Gaussian
variable $dW(v)$ for each infinitesimal interval $[v, v + dv)$. Since $A$ is just the sum of
these Gaussian variables, it must ostensibly be a Gaussian variable itself. From
Eqs. (2.29)-(2.31) with $B = 0$ it follows that the ostensible distribution for $A$ is the
rotationally-invariant Gaussian

$$ Q^{\rm het}(A)\, d^2A = \pi^{-1} \exp(-|A|^2)\, d^2A. \tag{3.5} $$

From these results and Eq. (2.18) we find the POM for heterodyne measurements to be

$$ G^{\rm het}(A) = \pi^{-1} \exp(-|A|^2)\, |\tilde\psi(A, 0)\rangle \langle \tilde\psi(A, 0)|. \tag{3.6} $$



Now from Eqs. (2.20)-(2.22) it is easy to verify that $|\tilde\psi(A, 0)\rangle$ is simply
proportional to the coherent state $|A\rangle$ where $A$ is the coherent amplitude usually
denoted $\alpha$. It turns out that the proportionality factor is just $\exp(|A|^2/2)$ so
that we can rewrite Eq. (3.6) as

$$ G^{\rm het}(A) = \pi^{-1}\, |A\rangle \langle A|. \tag{3.7} $$

This result has been obtained many times before by other means; for one example see
Ref. [?]. The factor of $\pi^{-1}$ remains because the coherent states are overcomplete.

In the context of this paper we are interested in heterodyne measurements only in so far as
they enable us to make an estimate of the phase of the system. If there is no prior
information about the system then Eq. (3.7) suggests a good estimate of the phase to be

$$ \phi_{\rm het} = \arg A. \tag{3.8} $$

The POM for this phase estimate is found simply by marginalizing the modulus of $A$. That is,

$$ F^{\rm het}(\phi) = \int_0^\infty |A|\, d|A|\; G^{\rm het}(|A| e^{i\phi}). \tag{3.9} $$

Evaluating this in the number state basis yields the matrix $H$ of Eq. (2.4) to be

$$ H^{\rm het}_{mn} = \frac{\Gamma\left( \frac{n+m}{2} + 1 \right)}{\sqrt{n!\, m!}}. \tag{3.10} $$

Clearly $H^{\rm het}_{nn} = 1$ as required, while the off-diagonal elements decrease with
distance away from the diagonal. These features can be seen in the matrix plot of
$H^{\rm het}_{nm}$ in Fig. 2(b).
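
The heterodyne matrix elements $H^{\rm het}_{mn} = \Gamma((n+m)/2 + 1)/\sqrt{n!\,m!}$ are straightforward to tabulate. The following is a minimal sketch (the function name is hypothetical); it works with logarithms of gamma functions so that large photon numbers do not overflow.

```python
import math

def h_het(m, n):
    """Heterodyne matrix element of Eq. (3.10), evaluated via log-gamma."""
    log_h = (math.lgamma((n + m) / 2 + 1)
             - 0.5 * (math.lgamma(n + 1) + math.lgamma(m + 1)))
    return math.exp(log_h)

diagonal = [h_het(n, n) for n in range(6)]       # each should equal 1
first_row = [h_het(0, n) for n in range(6)]      # decreases away from the diagonal
```

For example, `h_het(0, 1)` evaluates to $\Gamma(3/2) = \sqrt{\pi}/2 \approx 0.886$, already noticeably below the canonical value of unity.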
B. Adaptive Measurements

A heterodyne phase measurement is not as good as a canonical measurement because it is
actually a measurement of both phase and amplitude, with the latter information being thrown
away. In order to make a better phase measurement one would like to concentrate on measuring
the phase quadrature. This can be done by homodyne detection [?], but only if one already
knows the phase of the system. A true phase measurement should work even if one has no
information about the system phase. Nevertheless we can use this idea to construct a true
phase measurement as follows. Rather than measuring a fixed quadrature, we control the local
oscillator phase as a function of time in order to measure the estimated phase quadrature.
That is, we set $\Phi(v)$ to be equal to

$$ \Phi(v) = \varphi(v) + \pi/2, \tag{3.11} $$

where $\varphi(v)$ is the estimated phase of the system at time $v$.

Two questions remain to be decided. First, given our measurement record
$\{I(u) : 0 < u < v\}$ how do we decide $\varphi(v)$? Second, what do we choose to be our
best estimate of phase $\phi$ once the measurement is completed? We will postpone answering
the second question. It was already noted above that the theory of dyne measurements implies
that we should choose $\varphi(v) = f_v(A_v, B_v)$ for some function $f$. For the remainder
of this paper we choose

$$ \varphi(v) = \arg A_v \tag{3.12} $$

as in Ref. [7]. As outlined in that reference, the motivations for this choice are:

1. It is suggested by the above analysis for heterodyne detection.

2. As shown by one of us [?] it reproduces the canonical result if the system has at most
one photon.

3. It gives a feedback algorithm which would be easy to implement experimentally.

4. It is mathematically tractable.

By mathematically tractable we mean that it can be exactly solved, in the sense that we can
determine the POM. To do this requires only the ostensible probability distribution
$Q^{\rm ad}(A, B)$ given the feedback algorithm Eqs. (3.11)-(3.12). To find this it is
convenient to recast the ostensible integral equations (2.14), (2.15) as the ostensible Ito
stochastic differential equations

$$ dA_v = e^{i\Phi(v)}\, dW(v), \tag{3.13} $$
$$ dB_v = -e^{2i\Phi(v)}\, dv, \tag{3.14} $$

with the initial conditions

$$ A_0 = B_0 = 0. \tag{3.15} $$

With the above feedback algorithm we have $e^{i\Phi(v)} = iA_v/|A_v|$. This gives

$$ dA_v = iA_v\, dW(v)/|A_v|. \tag{3.16} $$


This can be solved by transforming to polar co-ordinates $\varphi(v) = \arg A_v$ and
$|A_v|^2$. Using the Ito calculus we find

$$ d|A_v|^2 = dv, \tag{3.17} $$
$$ d\varphi(v) = dW(v)/|A_v|. \tag{3.18} $$

The first of these can be solved trivially to yield $|A_v| = \sqrt{v}$. That is, the modulus
of $A$ evolves deterministically and in particular $|A| = 1$, as required by Eq. (2.31).
Substituting this into the second gives

$$ \varphi(v) = \varphi(0) + \int_0^v \frac{dW(u)}{\sqrt{u}}. \tag{3.19} $$

Here $\varphi(0)$ is an arbitrary initial phase. It is irrelevant to the problem because the
divergence at $v = 0$ of the integrand in this equation means that the initial phase will be
randomized immediately:

$$ \langle \varphi^2 \rangle_Q = \int_0 dv\, \frac{1}{v} = \infty. \tag{3.20} $$

Thus the ostensible probability distribution for $A$ is

$$ Q^{\rm ad}_A(A)\, d^2A = \delta(|A| - 1)\, |A|\, d|A|\, \frac{1}{2\pi}\, d(\arg A). \tag{3.21} $$
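
The deterministic growth of $|A_v|$ under the ostensible feedback equation $dA_v = iA_v\,dW(v)/|A_v|$ can be checked with a simple simulation. This is an illustrative sketch only, not the method used in the paper: it uses an Euler-Maruyama step, a fixed random seed, and (as an assumption needed because $A_0 = 0$ has no direction) a uniformly random phase for the first increment. In this discretization each increment is perpendicular to $A$, so $|A_1|^2$ equals the sum of the squared noise increments, which concentrates at 1 as the step size shrinks.

```python
import cmath
import math
import random

def simulate_ostensible_A(steps=2000, seed=1):
    """Euler-Maruyama integration of dA = i A dW / |A| over scaled time 0..1."""
    rng = random.Random(seed)
    dv = 1.0 / steps
    # A_0 = 0 leaves the direction undefined; give the first kick a random
    # phase (an assumption of this sketch, mirroring the arbitrary initial phase).
    phase0 = rng.uniform(0.0, 2.0 * math.pi)
    A = rng.gauss(0.0, math.sqrt(dv)) * cmath.exp(1j * phase0)
    for _ in range(steps - 1):
        dW = rng.gauss(0.0, math.sqrt(dv))
        A += 1j * A / abs(A) * dW  # increment is always perpendicular to A
    return A

A_final = simulate_ostensible_A()
modulus_squared = abs(A_final) ** 2  # close to 1 for small step size
```

The final phase `cmath.phase(A_final)` varies from seed to seed, in line with the uniform distribution of $\arg A$ in the ostensible distribution above.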

We require the joint ostensible probability distribution $Q^{\rm ad}(A, B)$. But rather than
work with $B_v$ it is more convenient to consider the variable

$$ C_v = e^{-2i\varphi(v)} \int_0^v e^{2i\varphi(u)}\, du. \tag{3.22} $$

It is easy to prove that for $v = 1$

$$ C = BA^*/A, \tag{3.23} $$

so that $A, C$ can replace $A, B$ as the sufficient statistics. The advantage of the variable
$C_v$ is that, from Eq. (3.22) and Eq. (3.19), it obeys the stochastic Ito differential
equation

$$ dC_v = -\left[ \frac{2i\, dW(v)}{\sqrt{v}} + \frac{2\, dv}{v} \right] C_v + dv, \tag{3.24} $$

with the initial condition $C_0 = 0$. Since neither this initial condition nor the above
differential equation involves the value of $\varphi(0)$ (which is essentially random as
noted above), the final value of $C$ will be ostensibly independent of that of $A$. That is,

$$ Q^{\rm ad}(A, C) = Q^{\rm ad}_A(A)\, Q^{\rm ad}_C(C). \tag{3.25} $$

In fact, given the above result Eq. (3.21) we need only $\varphi = \arg A$ so that

$$ Q^{\rm ad}(A, C)\, d^2A\, d^2C \to \frac{d\varphi}{2\pi}\, Q^{\rm ad}_C(C)\, d^2C. \tag{3.26} $$



The problem remaining is thus to find $Q^{\rm ad}_C(C)$. It has not proven possible to find
this analytically. However we have been able to find the exact values of the moments

$$ M_{n,m} = \langle C^n (C^*)^m \rangle_Q \tag{3.27} $$

via a recurrence relation. This is done in Appendix A. For our purposes these moments are
sufficient so we can assume the distribution $Q^{\rm ad}_C(C)$ known. From Eq. (2.18), the
POM for the results $\varphi, C$ under the feedback algorithm (3.11)-(3.12) is thus

$$ G^{\rm ad}(\varphi, C)\, d\varphi\, d^2C = |\tilde\psi(e^{i\varphi}, e^{2i\varphi}C)\rangle \langle \tilde\psi(e^{i\varphi}, e^{2i\varphi}C)| \times \frac{d\varphi}{2\pi}\, d^2C\, Q^{\rm ad}_C(C). \tag{3.28} $$

Since the point of this exercise is to construct a phase measurement, we want ultimately to
calculate some phase $\phi_{\rm ad}(\varphi, C)$ from the sufficient statistics
$\varphi, C$. We are not constrained to choose $\varphi$ even though we have been using it as
our estimated phase in the feedback loop. Therefore the general expression for the POM of our
adaptive phase measurement is

$$ F^{\rm ad}(\phi) = \int_0^{2\pi} d\varphi \iint d^2C\; G^{\rm ad}(\varphi, C)\, \delta\big(\phi - \phi_{\rm ad}(\varphi, C)\big). \tag{3.29} $$



There are constraints on the function $\phi_{\rm ad}(\varphi, C)$. Clearly if the phase of
the state $\rho$ is rotated by some angle $\theta$, the probability distribution
$P_{\rm ad}(\phi) = \mathrm{Tr}[\rho F^{\rm ad}(\phi)]$ for $\phi$ should be shifted
similarly. Now to rotate the phase of the state by $\theta$ is equivalent to rotating that of
the POM by $-\theta$. This has the effect of replacing
$|\tilde\psi(e^{i\varphi}, e^{2i\varphi}C)\rangle$ by

$$ e^{-i\theta a^\dagger a}\, |\tilde\psi(e^{i\varphi}, e^{2i\varphi}C)\rangle = |\tilde\psi(e^{i(\varphi - \theta)}, e^{2i(\varphi - \theta)}C)\rangle. \tag{3.30} $$

Thus the distribution $P_{\rm ad}(\phi)$ will shift by the desired amount if and only if
$\phi_{\rm ad}$ is given by

$$ \phi_{\rm ad} = \varphi + g(C) \tag{3.31} $$

for some arbitrary real function $g$ of $C$. Furthermore, it can be shown that for $H_{mn}$
to be real and positive we need $g(C^*) = -g(C)$.



1. Adaptive Mark I Measurements 

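The simplest choice is $g = 0$. This corresponds to

$$ \phi_{\rm I} = \varphi = \arg A. \tag{3.32} $$

That is, the phase estimate $\varphi$ used in the feedback loop is also used as the final
phase estimate. We call this the adaptive mark I measurement. In this case the POM is

$$ F^{\rm I}(\phi) = \iint d^2C\; G^{\rm ad}(\phi, C) = \frac{1}{2\pi} \iint d^2C\; Q^{\rm ad}_C(C)\, |\tilde\psi(e^{i\phi}, e^{2i\phi}C)\rangle \langle \tilde\psi(e^{i\phi}, e^{2i\phi}C)|. \tag{3.33} $$

This POM can be easily evaluated in the number state basis using the definition (2.19). The
result is in the form of Eq. (2.4) with the matrix $H$ given by

$$ H^{\rm I}_{mn} = \sum_{p=0}^{\lfloor m/2 \rfloor} \sum_{q=0}^{\lfloor n/2 \rfloor} \gamma_{mp} \gamma_{nq}\, \langle C^p (C^*)^q \rangle_Q \tag{3.34} $$
$$ \phantom{H^{\rm I}_{mn}} = \sum_{p=0}^{\lfloor m/2 \rfloor} \sum_{q=0}^{\lfloor n/2 \rfloor} \gamma_{mp} \gamma_{nq}\, M_{p,q}. \tag{3.35} $$

Here $\lfloor m/2 \rfloor$ is the integer part of $m/2$ and

$$ \gamma_{mp} = \frac{\sqrt{m!}}{2^p\, (m - 2p)!\, p!}. \tag{3.36} $$

This is an exact expression since the moments $M_{p,q}$ can be calculated exactly. It is not
obvious from this definition that $H^{\rm I}_{nn} = 1$ for all $n$, but this can be verified
computationally.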
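
The computational verification mentioned above can be sketched as follows. Appendix A is not reproduced in this section, so the recurrence used here is a reconstruction and should be read as an assumption: applying Ito's rule to Eq. (3.24) and assuming the scaling $\langle C_v^n (C_v^*)^m \rangle_Q = M_{n,m} v^{n+m}$ gives $M_{n,m} = [n M_{n-1,m} + m M_{n,m-1}] / [n + m + 2(n-m)^2]$ with $M_{0,0} = 1$. The function names are hypothetical. The coefficients are $\gamma_{mp} = \sqrt{m!}/[2^p (m-2p)!\,p!]$ and the matrix is $H^{\rm I}_{mn} = \sum_{p,q} \gamma_{mp}\gamma_{nq} M_{p,q}$.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def moment(n, m):
    """Ostensible moment <C^n (C*)^m>_Q at v = 1 (reconstructed recurrence)."""
    if n < 0 or m < 0:
        return 0.0
    if n == 0 and m == 0:
        return 1.0
    numerator = n * moment(n - 1, m) + m * moment(n, m - 1)
    return numerator / (n + m + 2 * (n - m) ** 2)

def gamma_coeff(m, p):
    """Coefficient gamma_{mp} of Eq. (3.36)."""
    return math.sqrt(math.factorial(m)) / (
        2 ** p * math.factorial(m - 2 * p) * math.factorial(p))

def h_mark1(m, n):
    """Adaptive mark I matrix element, Eq. (3.35)."""
    return sum(gamma_coeff(m, p) * gamma_coeff(n, q) * moment(p, q)
               for p in range(m // 2 + 1)
               for q in range(n // 2 + 1))

diagonal = [h_mark1(n, n) for n in range(9)]
```

With this recurrence the lowest moments are $M_{1,0} = M_{1,1} = 1/3$, and the diagonal elements evaluate to unity to within rounding error, as completeness requires.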

The matrix $H^{\rm I}_{nm}$ is plotted in Fig. 2(c). It appears not greatly different from
that for the heterodyne measurement. One difference is that
$H^{\rm I}_{0m} = H^{\rm I}_{1m}$ for all $m$, and in particular that for $n, m \le 1$,
$H^{\rm I}_{nm} = 1$. This is identical to the canonical measurement and as good as possible,
as first revealed in Ref. [?]. This result shows that for very weak fields the adaptive
mark I measurement is significantly better than the standard heterodyne technique. For
moderate fields it is not significantly better (as Fig. 2 shows). As we will show later, for
large fields it is very much worse. Evidently the adaptive mark I scheme is not the scheme we
would choose for most practical situations in which the photon number per pulse is very
large.



2. Adaptive Mark II Measurements 



A generally better result can be obtained by considering a final phase estimate
$\phi_{\rm ad} = \varphi + g(C)$ with $g(C) \neq 0$. Recall the result Eq. (2.24) obtained
above, that the probability of obtaining a measurement result is proportional to the squared
inner product of the system state with a squeezed state

$$ P(\alpha, \epsilon) \propto |\langle \alpha, \epsilon | \psi \rangle|^2. \tag{3.37} $$

Here $\alpha, \epsilon$ are defined in terms of $A, B$ by Eqs. (2.21) and (2.22). We are
interested in the case when the state $|\psi\rangle$ has a well-defined (but unknown) phase.
Since any physical state will have a finite mean photon number this means that $|\psi\rangle$
must have a large coherent amplitude. As argued in Sec. II C, it is most likely that this
coherent amplitude will be close to $\alpha$. Now in terms of the variables $\varphi, C$ we
have

$$ \alpha = \frac{e^{i\varphi} (1 + C)}{1 - |C|^2}. \tag{3.38} $$

This suggests the mark II phase estimate

$$ \phi_{\rm II} = \arg \alpha = \varphi + \arg(1 + C). \tag{3.39} $$

That is, we choose the function $g(C)$ so that

$$ e^{2ig(C)} = \frac{1 + C}{1 + C^*}. \tag{3.40} $$

With this choice

$$ F^{\rm II}(\phi) = \iint d^2C\; G^{\rm ad}\big(\phi - \arg(1 + C), C\big). \tag{3.41} $$

The $H$ matrix is therefore

$$ H^{\rm II}_{mn} = \sum_{p=0}^{\lfloor m/2 \rfloor} \sum_{q=0}^{\lfloor n/2 \rfloor} \gamma_{mp} \gamma_{nq} \left\langle \left( \frac{1 + C}{1 + C^*} \right)^{(n-m)/2} C^p (C^*)^q \right\rangle_Q. \tag{3.42} $$

Unfortunately $[(1 + C)/(1 + C^*)]^{(n-m)/2}$ is not a polynomial in $C$ and $C^*$ so we
cannot obtain an exact answer in terms of the known moments $M_{p,q}$. However from the
definition (3.22) it is apparent that the modulus of the random variable $C$ is strictly
bounded by unity. In fact
$\langle C \rangle_Q = \langle C^* \rangle_Q = \langle C^* C \rangle_Q = 1/3$, and all higher
moments are smaller. Hence the MacLaurin series for $[(1 + C)/(1 + C^*)]^{(n-m)/2}$ will
converge rapidly and so can be well-approximated by a polynomial. Using an expansion to 100
terms, we have evaluated these POM matrix elements for $n, m$ up to 100.

The matrix $H^{\rm II}_{mn}$ for $n, m$ up to 8 is shown in Fig. 2(d). From this it is apparent that the adaptive mark II scheme is generally much closer to a canonical measurement in this range than are either the heterodyne or adaptive mark I scheme. Indeed, all the matrix elements are above 0.7, and all are greater than or equal to the heterodyne matrix elements. The only place where the adaptive mark II scheme is inferior to the adaptive mark I scheme is for very low photon numbers; $H^{\rm II}_{01} < 1$ unlike $H^{\rm I}_{01}$.
We will show in the next section that the superiority of 
the mark II scheme over the other two schemes continues 
for large photon numbers, as quantified by the measured 
phase variance of various states. 



IV. PHASE VARIANCE 



A. Phase Variance and $H_{mn}$



Because phase is a cyclic variable, the definitions of mean and variance which apply to the real line are not applicable. The sensible starting point for these two statistics for a cyclic variable with distribution $P(\phi)$ is

$$\mu = \langle e^{i\phi}\rangle = \int_0^{2\pi} d\phi\, P(\phi)\, e^{i\phi}. \tag{4.1}$$

The mean phase can then be defined to be

$$\bar\phi = \arg\mu, \tag{4.2}$$

and the phase variance

$$V = |\mu|^{-2} - 1. \tag{4.3}$$

It can easily be verified that these definitions go over to the usual ones appropriate for the real line when $P(\phi)$ is suitably localized (so that $1 - |\mu| \ll 1$). There are of course other definitions of the variance in terms of $\mu$ which would also give the correct limit US The advantage of the one presented here is that it can be used to derive an uncertainty relation

$$4V \ge \left(\langle a^\dagger a\, a^\dagger a\rangle - \langle a^\dagger a\rangle\langle a^\dagger a\rangle\right)^{-1}, \tag{4.4}$$

as shown by Holevo ]lq ]. This inequality holds for the variance of any $P(\phi)$ arising from a phase measurement conforming to the definition in Sec. II B.
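As an illustrative sketch (not part of the original calculation), the statistics $\mu = \int d\phi\,P(\phi)e^{i\phi}$, $\bar\phi = \arg\mu$ and $V = |\mu|^{-2} - 1$ can be evaluated numerically for any distribution tabulated on a uniform phase grid. The function name `holevo_variance` and the Gaussian test distribution are assumptions made for the example only.

```python
import numpy as np

def holevo_variance(phi, prob):
    """Return (mean phase, variance V = |mu|^-2 - 1) for a density on a uniform grid."""
    dphi = phi[1] - phi[0]
    norm = np.sum(prob) * dphi                      # normalise the tabulated density
    mu = np.sum(prob * np.exp(1j * phi)) * dphi / norm
    return np.angle(mu), abs(mu) ** -2 - 1.0

# Hypothetical example: a narrow Gaussian of standard deviation 0.1 centred at 0.3 rad.
sigma, centre = 0.1, 0.3
phi = np.linspace(-np.pi, np.pi, 20001, endpoint=False)
prob = np.exp(-(phi - centre) ** 2 / (2 * sigma ** 2))
mean_phase, variance = holevo_variance(phi, prob)
print(mean_phase, variance)
```

For a well-localized Gaussian one has $\mu = e^{i\bar\phi}e^{-\sigma^2/2}$, so the variance returned is $e^{\sigma^2} - 1 \approx \sigma^2$, which is the real-line limit mentioned in the text.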

Without loss of generality we can consider a system state

$$|\psi\rangle = \sum_{n=0}^{\infty} \psi_n |n\rangle \tag{4.5}$$

with real number state amplitudes $\psi_n$ so that it is guaranteed to have a mean phase of zero. The probability distribution from a phase measurement described by a POM $F(\phi)$ with matrix $H$ is

$$P(\phi) = \frac{1}{2\pi}\sum_{n,m=0}^{\infty} \psi_m\psi_n\, e^{i\phi(m-n)} H_{mn}. \tag{4.6}$$

For such a system we have

$$\mu = \sum_{n,m=0}^{\infty} \psi_m\psi_n H_{mn}\,\delta_{m+1,n} \tag{4.7}$$

$$= \sum_{n=0}^{\infty} \psi_n\psi_{n+1} H_{n+1,n}. \tag{4.8}$$

Thus the only part of $H$ which contributes to the phase variance is the subdiagonal

$$H_{n+1,n} = 1 - h(n). \tag{4.9}$$

Although $h^{\rm II}(n)$ is not known exactly it was calculated to a very good approximation for $n$ up to 100, as explained above. For heterodyne detection and adaptive mark I detection we have exact results and for a canonical phase measurement of course $h^{\rm can}(n) = 0$. For large photon numbers it is more useful to have approximate asymptotic expressions for $h(n)$ for the three physically realizable schemes. These can be derived using semiclassical dyne detection theory [7]. The results are

$$h^{\rm het}(m) \simeq (8m)^{-1} + O(m^{-2}), \tag{4.10}$$

$$h^{\rm I}(m) \simeq (8m^{1/2})^{-1} + O(m^{-1}), \tag{4.11}$$

$$h^{\rm II}(m) \simeq (16m^{3/2})^{-1} + O(m^{-2}). \tag{4.12}$$



As will be shown in Secs. IV B and IV C this leads to a clear superiority of the adaptive mark II scheme over the heterodyne scheme, and of the latter over the mark I scheme, for measuring the phase of states with large photon numbers. Furthermore, it is shown at the end of App. B that the adaptive mark II scheme is the best scheme for measuring large fields given the feedback algorithm (3.12).



B. Coherent States 



1. Canonical 



A coherent state of mean phase equal to zero has coefficients

$$\psi_n = \exp(-\beta^2/2)\frac{\beta^n}{\sqrt{n!}}. \tag{4.13}$$

Thus for a canonical measurement we can use Eq. (4.8) with $H_{mn} = 1$ to get

$$\mu = \exp(-\beta^2)\sum_{n=0}^{\infty} \frac{\beta^{2n+1}}{\sqrt{n!\,(n+1)!}}. \tag{4.14}$$

By expanding $\sqrt{n}$ in a Taylor series about $n = \beta^2$ while recognizing the moments of a Poisson distribution we obtain

$$\mu = 1 - \frac{1}{8\beta^2} - \frac{7}{128\beta^4} + O(\beta^{-6}). \tag{4.15}$$

Thus the variance from a canonical measurement of the phase of a coherent state is

$$V^{\rm can}_{\rm coh} = \frac{1}{4\beta^2} + \frac{5}{32\beta^4} + O(\beta^{-6}). \tag{4.16}$$

This can be regarded as the intrinsic phase variance of a coherent state. In Fig. 3 we have plotted the exact result obtained numerically from Eq. (4.14), and the asymptotic result Eq. (4.16) for $\beta$ from 1 to 5. The latter corresponds to a mean photon number of 25, which is evidently large enough for the asymptotic results to hold quite well.
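The comparison between the exact sum and the asymptotic variance can be sketched numerically as follows. The sketch assumes the sum $\mu = e^{-\beta^2}\sum_n \beta^{2n+1}/\sqrt{n!(n+1)!}$ of Eq. (4.14) and the two-term asymptotic form $1/(4\beta^2) + 5/(32\beta^4)$, the latter obtained here by expanding $|\mu|^{-2} - 1$ to second order; the function names and the truncation `nmax` are illustrative choices.

```python
import math

def mu_canonical(beta, nmax=400):
    """Truncated sum exp(-beta^2) * sum_n beta^(2n+1) / sqrt(n! (n+1)!), using logs."""
    total = 0.0
    for n in range(nmax):
        log_term = (-beta ** 2 + (2 * n + 1) * math.log(beta)
                    - 0.5 * (math.lgamma(n + 1) + math.lgamma(n + 2)))
        total += math.exp(log_term)
    return total

def variance_from_mu(mu):
    """Phase variance V = |mu|^-2 - 1 (mu is real and positive here)."""
    return mu ** -2 - 1.0

def v_can_asymptotic(beta):
    """Assumed two-term asymptotic form of the canonical coherent-state variance."""
    return 1.0 / (4 * beta ** 2) + 5.0 / (32 * beta ** 4)

for beta in (1.0, 2.0, 5.0):
    print(beta, variance_from_mu(mu_canonical(beta)), v_can_asymptotic(beta))
```

Working with logarithms of the terms avoids overflow of the factorials when the truncation is large compared with $\beta^2$.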



2. Heterodyne 

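For heterodyne detection we can use the exact expression Eq. (3.10) to get

$$\mu = \beta\exp(-\beta^2)\sum_{n=0}^{\infty} \frac{\Gamma(n+\frac{3}{2})\,\beta^{2n}}{\Gamma(n+2)\,\Gamma(n+1)}. \tag{4.17}$$

In terms of confluent hypergeometric functions, this is

$$\mu = \beta\exp(-\beta^2)\frac{\Gamma(\frac{3}{2})}{\Gamma(2)}\,{}_1F_1(\tfrac{3}{2};2;\beta^2). \tag{4.18}$$

Using the analogue to Euler's formula, 4.2(1) of JTq], we obtain the asymptotic expansion

$$\mu \simeq 1 - \frac{1}{4\beta^2} - \frac{3}{32\beta^4} + O(\beta^{-6}). \tag{4.19}$$

Thus the phase variance from a heterodyne measurement is

$$V^{\rm het}_{\rm coh} = \frac{1}{2\beta^2} + \frac{3}{8\beta^4} + O(\beta^{-6}). \tag{4.20}$$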



To first (and almost to second) order this is twice the canonical phase variance. The reason for this is apparent from the expression Eq. (3J3) for the heterodyne POM. The probability distribution for a heterodyne phase measurement is

$$P^{\rm het}_{\rm coh}(\phi) = \int_0^{\infty} |A|\,d(|A|)\,\langle\beta|F^{\rm het}(|A|e^{i\phi})|\beta\rangle \tag{4.21}$$

$$= \frac{1}{\pi}\int_0^{\infty} r\,dr\,|\langle\beta|re^{i\phi}\rangle|^2. \tag{4.22}$$

For $\phi$ close to the mean value of 0 the integrand will be strongly peaked at $r \simeq \beta \gg 1$. Thus

$$P^{\rm het}_{\rm coh}(\phi) \propto |\langle\beta|\beta e^{i\phi}\rangle|^2. \tag{4.23}$$

In other words, this distribution is approximately the convolution of the intrinsic phase distributions of two coherent states of amplitude $\beta$. Thus we expect the distribution to be approximately Gaussian, with a variance double that of a canonical measurement. The exact result from Eq. (4.18) and the asymptotic result Eq. (4.20) are plotted on Fig. 3. The excess phase noise in the heterodyne result arises because the measurement is not as good as the canonical measurement. In fact, we have

$$V^{\rm het}_{\rm coh} - V^{\rm can}_{\rm coh} \simeq \frac{1}{4\beta^2} = 2h^{\rm het}(\beta^2), \tag{4.24}$$

where $h(m)$ is the asymptotic expression for $1 - H_{m,m+1}$ given in Eq. (4.10). The quantity in Eq. (4.24), which we will call the excess phase variance, is plotted in Fig. 4. From Eq. (4.8) it follows that, for states with a well-defined coherent amplitude, the excess phase variance for any scheme is approximately $2h(\beta^2)$.
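The excess phase variance of heterodyne detection can be checked numerically against $1/(4\beta^2)$ with a short sketch. It assumes the heterodyne sum of Eq. (4.17), $\mu = \beta e^{-\beta^2}\sum_n \Gamma(n+\frac{3}{2})\beta^{2n}/[\Gamma(n+2)\Gamma(n+1)]$, and the canonical sum of Eq. (4.14); the function names and the truncation `nmax` are illustrative.

```python
import math

def mu_heterodyne(beta, nmax=400):
    """Truncated heterodyne sum of Eq. (4.17), each term evaluated through its logarithm."""
    total = 0.0
    for n in range(nmax):
        log_term = (math.log(beta) - beta ** 2 + 2 * n * math.log(beta)
                    + math.lgamma(n + 1.5) - math.lgamma(n + 2) - math.lgamma(n + 1))
        total += math.exp(log_term)
    return total

def mu_canonical(beta, nmax=400):
    """Truncated canonical sum of Eq. (4.14), i.e. the same sum with H = 1."""
    total = 0.0
    for n in range(nmax):
        log_term = (-beta ** 2 + (2 * n + 1) * math.log(beta)
                    - 0.5 * (math.lgamma(n + 1) + math.lgamma(n + 2)))
        total += math.exp(log_term)
    return total

def excess_variance(beta):
    """V_het - V_can with V = mu^-2 - 1 for each scheme."""
    return (mu_heterodyne(beta) ** -2 - 1.0) - (mu_canonical(beta) ** -2 - 1.0)

for beta in (2.0, 5.0):
    print(beta, excess_variance(beta), 1.0 / (4 * beta ** 2))
```

The two printed columns approach each other as $\beta$ grows, which is the content of the statement that the excess variance is approximately $2h^{\rm het}(\beta^2)$.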







3. Mark I Adaptive

It was shown in Ref. [7] that for a coherent state of amplitude $\beta \gg 1$ the adaptive mark I phase $\varphi$ can be approximated by a Gaussian random variable of mean zero and variance

$$V^{\rm I}_{\rm coh} = \frac{1}{4\beta} + O(\beta^{-2}). \tag{4.25}$$

This is plotted in Fig. 3 along with the exact result calculated from Eqs. (4.8) and (3.34) truncated at $n = 100$. This result shows that the adaptive mark I is far worse than a heterodyne measurement for large $\beta$. Indeed, to the order calculated, the phase variance is entirely due to the excess phase variance

$$V^{\rm I}_{\rm coh} - V^{\rm can}_{\rm coh} \simeq \frac{1}{4\beta} + O(\beta^{-2}). \tag{4.26}$$

This was the result used to obtain

$$h^{\rm I}(\beta^2) = \tfrac{1}{2}\left(V^{\rm I}_{\rm coh} - V^{\rm can}_{\rm coh}\right) = \frac{1}{8\beta} + O(\beta^{-2}), \tag{4.27}$$

as recorded above in Eq. (4.11). The asymptotic result (4.26) and its exact value are plotted in Fig. 4. This shows that for small coherent states, with amplitude less than about 2, the mark I measurement introduces less excess noise than the heterodyne measurement. For $\beta = 5$ the asymptotic result is already a very good approximation.

4. Adaptive Mark II

For our final scheme we again used semiclassical techniques in Ref. [7] to show that $P^{\rm II}_{\rm coh}(\phi)$ was approximately Gaussian with a variance

$$V^{\rm II}_{\rm coh} \simeq \frac{1}{4\beta^2} + \frac{1}{8\beta^3}. \tag{4.28}$$

Like the canonical result, this is dominated by the intrinsic phase noise of the coherent state. This asymptotic result, and the exact result from Eqs. (4.8) and (3.42), are plotted in Fig. 3. The excess phase noise in this case is

$$2h^{\rm II}(\beta^2) \simeq V^{\rm II}_{\rm coh} - V^{\rm can}_{\rm coh} \simeq \frac{1}{8\beta^3} + O(\beta^{-4}), \tag{4.29}$$

which is far below that of the other two dyne schemes. This asymptotic result, and the exact excess phase variance, are plotted in Fig. 4. Once again, the asymptotic behaviour is evident for $\beta = 5$.

C. Phase-optimized states

From the coherent state results, the marked superiority of the adaptive mark II measurement over the standard techniques is apparent only from considering the excess phase variance. A more direct measure is the minimum phase variance for each measurement scheme. In this measure, the state is optimized for each scheme, and is subject to the constraint of having a maximum photon number $N$. That is to say we have to optimize the unit-norm real vector $(\psi_0, \psi_1, \ldots, \psi_N)$ so as to maximize

$$\mu = \sum_{n=0}^{N-1} \psi_{n+1}\psi_n[1 - h(n)]. \tag{4.30}$$

This can be rewritten as

$$\mu = \sum_{m,n=0}^{N} \psi_m J_{mn}\psi_n, \tag{4.31}$$

where

$$J_{mn} = \tfrac{1}{2}[1 - h(n)]\,\delta_{m,n+1} + \tfrac{1}{2}[1 - h(m)]\,\delta_{n,m+1}. \tag{4.32}$$

The problem of maximizing $\mu$ thus reduces to that of finding the largest eigenvalue $\lambda_{\max}$ of the real symmetric matrix $J$. Since we have $h(n)$ for all schemes up to $n = 100$ this can be done for a maximum photon number $N$ up to 100.

For the canonical case with $h(m) = 0$ the eigenvalue can be found exactly to be

$$\lambda_{\max} = \cos\left(\frac{\pi}{N+2}\right), \tag{4.33}$$

so that

$$V^{\rm can}_{\min} = \tan^2\left(\frac{\pi}{N+2}\right) = \frac{\pi^2}{N^2} - \frac{4\pi^2}{N^3} + O(N^{-4}). \tag{4.34}$$



For the dyne measurements there is no analytical solution 
but a numerical solution is easily obtained. The results 
are plotted in Fig. 5. This clearly shows the same order 
as established for coherent states with large photon num- 
bers: the adaptive mark II measurement is best, followed 
by heterodyne, followed by adaptive mark I. 
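The numerical solution referred to here amounts to a small symmetric eigenvalue problem. The following sketch assumes the tridiagonal matrix $J$ with entries $J_{n+1,n} = J_{n,n+1} = \frac{1}{2}[1 - h(n)]$, the canonical case $h = 0$, and the heterodyne subdiagonal $H_{n+1,n} = \Gamma(n+\frac{3}{2})/\sqrt{n!\,(n+1)!}$ read off from Eq. (4.17); the function names are illustrative, and the adaptive $h(n)$ would have to be supplied from the moment calculation.

```python
import numpy as np
from math import lgamma, exp, cos, tan, pi

def h_canonical(n):
    """Canonical measurement: H_{n+1,n} = 1, so h(n) = 0."""
    return 0.0

def h_heterodyne(n):
    """h(n) = 1 - H_{n+1,n} with the heterodyne subdiagonal assumed from Eq. (4.17)."""
    return 1.0 - exp(lgamma(n + 1.5) - 0.5 * (lgamma(n + 1) + lgamma(n + 2)))

def min_phase_variance(h, N):
    """Largest eigenvalue of J for amplitudes psi_0..psi_N, and V = lambda_max^-2 - 1."""
    off = np.array([0.5 * (1.0 - h(n)) for n in range(N)])
    J = np.diag(off, 1) + np.diag(off, -1)      # real symmetric, zero diagonal
    lam = np.linalg.eigvalsh(J)[-1]             # eigvalsh returns ascending eigenvalues
    return lam, lam ** -2 - 1.0

lam_can, v_can = min_phase_variance(h_canonical, 10)
lam_het, v_het = min_phase_variance(h_heterodyne, 10)
print(v_can, tan(pi / 12) ** 2, v_het)
```

For $N = 10$ the canonical value reproduces $\tan^2[\pi/(N+2)]$, which gives a simple check on the construction of $J$ before other schemes are substituted.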

Also plotted in Fig. 5 are the asymptotic results for the three dyne measurements. These were obtained in Ref. [7] using the asymptotic results for $h(n)$ of Eqs. (4.10)-(4.12). The results are most easily expressed by noting that these functions $h(n)$ can all be written as

$$h^{\rm dyne}(n) = cn^{-p} \tag{4.35}$$

for some positive power $p \ge 1/2$ and positive coefficient $c$ of order unity. From this we got

$$V^{\rm dyne}_{\min} \simeq 2cN^{-p} + (-z_1)(2cp)^{2/3}N^{-2(1+p)/3}, \tag{4.36}$$

where $z_1 \simeq -2.338$ is the first zero of the Airy function. The leading term here is simply equal to $2h(N)$. This is essentially the excess noise introduced by the measurement, just as $2h^{\rm dyne}(\beta^2)$ was for the coherent state. In this case the intrinsic noise (the second term) varies between the different schemes because the state is optimized for each measurement.

From Fig. 5 it is apparent that the exact numerical results are approaching this asymptotic result for the heterodyne and mark I measurements. However the mark II exact results are a long way from the asymptotic results even with $N = 100$. This is actually not surprising. A simple calculation carried out in Ref. [7] suggested that the asymptotic results would only become valid for

$$N \gtrsim N_{\rm as} = \left(\frac{10^3}{2cp}\right)^{1/(2-p)}. \tag{4.37}$$

For an adaptive mark I measurement we have $N_{\rm as} = 400$; for heterodyne $N_{\rm as} = 4000$; and for adaptive mark II $N_{\rm as} \simeq 3\times10^7$. Evidently these requirements are overly conservative (as mooted in our earlier paper). Nevertheless, it does explain why the minimum adaptive mark II phase variance is a long way from reaching its asymptote for $N = 100$. This underlines the usefulness of the approximate asymptotic results. An exact numerical solution with $N = 10^7$ would be severely impractical. It also points out the danger of trying to derive power laws such as Eq. (4.36) from numerical data for moderate photon numbers of a few hundred, as done by D'Ariano and Paris in Ref. p7| . A detailed comparison with their results for heterodyne detection for optimized states with a fixed mean photon number will appear in a future paper.
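The three threshold photon numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that Eq. (4.37) has the form $N_{\rm as} = [10^3/(2cp)]^{1/(2-p)}$, a reading of the formula that is consistent with the three quoted values, with $(c, p)$ taken from Eqs. (4.10)-(4.12); the function and dictionary names are illustrative.

```python
def n_asymptotic(c, p):
    """Assumed form of the threshold: N_as = (10^3 / (2 c p)) ** (1 / (2 - p))."""
    return (1e3 / (2 * c * p)) ** (1.0 / (2.0 - p))

# (c, p) read off from h(n) = c n^-p for each dyne scheme.
schemes = {
    "mark I": (1 / 8, 1 / 2),
    "heterodyne": (1 / 8, 1),
    "mark II": (1 / 16, 3 / 2),
}

for name, (c, p) in schemes.items():
    print(name, n_asymptotic(c, p))
```

The strong growth of the threshold with $p$ is what places the mark II asymptote far beyond the range $N \le 100$ accessible to the exact numerics.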



V. PHASE PROBABILITY DISTRIBUTIONS 

A. $P(\phi)$ for coherent states

Although the semiclassical theory of Ref. [7] has proven invaluable for calculating the asymptotic phase variance for states of large photon number, it cannot readily yield the total phase distribution $P(\phi)$. This is the quantity that is needed for a proper analysis of optical communication based on encoding information in the phase of single-mode pulses. For a communication system there are certain phases which one would be expecting to receive, so what matters is not the mean-square error in the phase measurement, but the probability for mistaking one phase for another. This depends on the total $P(\phi)$, which requires knowledge of the full matrix $H_{mn}$:

$$P(\phi) = \frac{1}{2\pi}\sum_{m,n=0}^{\infty} \rho_{mn} H_{mn}\, e^{i\phi(m-n)}, \tag{5.1}$$

where $\rho_{mn}$ is the density matrix for the system state in the photon number basis.

Before calculating probabilities of error it is informative simply to plot $P(\phi)$ for the various schemes with the system in a coherent state. In Fig. 6 we plot $\log P_{\rm coh}(\phi)$ versus $\phi$ for various values of coherent amplitude $\beta$. One thing is clear: the canonical $P(\phi)$ is best by any definition. For small coherent amplitudes the adaptive mark I case is the best dyne measurement, and is almost indistinguishable from the canonical measurement. As $\beta$ becomes larger the peak of $P^{\rm het}_{\rm coh}(\phi)$ becomes sharper and taller than that of $P^{\rm I}_{\rm coh}(\phi)$. The peak of $P^{\rm II}_{\rm coh}(\phi)$ becomes sharper and taller still, and for moderate $\beta$ is indistinguishable from that of $P^{\rm can}_{\rm coh}(\phi)$. All of the curves are inverted parabolas for small $\phi$, indicating that the distributions $P(\phi)$ are approximately Gaussian.

All of these features could be predicted from the above results. What is unexpected is the shape of the tails of the curves. First, as $\beta$ increases, $P^{\rm can}_{\rm coh}(\phi)$ ceases to fall monotonically with distance from $\phi = 0$, but suddenly reverses at $\phi \approx 1$ and has a broad local maximum at $\phi = \pi$. The heterodyne distribution has no such reversal, but nevertheless levels out and approaches the canonical value at $\phi = \pi$. The adaptive mark I case is also apparently smooth, but has much higher tails than the canonical and heterodyne distributions. The big surprise is the adaptive mark II distribution. Like the canonical distribution it reverses (although smoothly) and has a broad local maximum at $\phi = \pi$. But the value of $P^{\rm II}_{\rm coh}(\pi)$ is actually the largest of all four schemes! In fact, for large $\beta$, $P^{\rm II}_{\rm coh}(\phi)$ closely follows $P^{\rm can}_{\rm coh}(\phi)$ until it reaches a floor, which is roughly the same as that of $P^{\rm I}_{\rm coh}(\phi)$.

These features are not easy to explain from the matrix elements $H_{mn}$. For example, the ratio of the probability density at $\phi = \pi$ to that at $\phi = 0$ is given by

$$\frac{P(\pi)}{P(0)} = \frac{\sum_{m,n} (-1)^{m-n} H_{mn}\,\beta^{m+n}/\sqrt{m!\,n!}}{\sum_{m,n} H_{mn}\,\beta^{m+n}/\sqrt{m!\,n!}}. \tag{5.2}$$

Evidently this ratio depends crucially on the relative values of the matrix elements $H_{mn}$ for $m, n \sim \beta^2$. In particular, just because $H^{a}_{mn} \ge H^{b}_{mn}\ \forall m, n$ it does not follow that $P^{a}(\pi) \le P^{b}(\pi)$. That is, a measurement with a POM closer to the canonical POM, in the sense of having all elements of $H_{mn}$ closer to unity, does not guarantee an unambiguously better phase probability distribution.



1. Heterodyne Measurements 

For heterodyne detection we can find an expression for $P(\pi)$ analytically. Recall that in this case the POM is

$$G^{\rm het}(\alpha)\,d^2\alpha = \frac{1}{\pi}|\alpha\rangle\langle\alpha|\,d^2\alpha, \tag{5.3}$$

where $|\alpha\rangle$ is a coherent state and the phase estimate is $\phi = \arg\alpha$. Clearly then the probability density to obtain $\phi = \pi$ is

$$P^{\rm het}_{\rm coh}(\pi) = \frac{1}{\pi}\int_0^{\infty} r\,dr\,|\langle\beta|-r\rangle|^2 \tag{5.4}$$

$$= \frac{1}{\pi}\int_0^{\infty} r\,dr\,\exp\left(-(\beta + r)^2\right). \tag{5.5}$$

This integral can be evaluated in terms of the error function, but for $\beta \gg 1$ it is well approximated by

$$P^{\rm het}_{\rm coh}(\pi) \simeq \frac{1}{4\pi\beta^2}\exp(-\beta^2). \tag{5.6}$$

It can be verified from Fig. 6 that this is a very good approximation even for $\beta = 5$. For very large $\beta$ the most important contribution is the $\exp(-\beta^2)$ term. This scaling can be expressed as

$$\log P^{\rm het}_{\rm coh}(\pi) \sim -\beta^2. \tag{5.7}$$

2. Adaptive measurements

For the adaptive measurements we can also determine $P(\pi)$ by returning to the POM

$$G_{\rm ad}(\varphi, C)\,d\varphi\,d^2C = \frac{d\varphi}{2\pi}\,d^2C\,Q_C(C) \times |\tilde\psi(e^{i\varphi}, e^{2i\varphi}C)\rangle\langle\tilde\psi(e^{i\varphi}, e^{2i\varphi}C)|, \tag{5.8}$$

where

$$|\tilde\psi(e^{i\varphi}, e^{2i\varphi}C)\rangle = \exp\left(\tfrac{1}{2}e^{2i\varphi}Ca^{\dagger 2} + e^{i\varphi}a^{\dagger}\right)|0\rangle. \tag{5.9}$$

For a coherent state $|\beta\rangle$ with $\beta$ real the probability density is

$$P^{\rm ad}_{\rm coh}(\varphi, C) = \frac{Q_C(C)}{2\pi}\,|\langle\beta|\tilde\psi(e^{i\varphi}, e^{2i\varphi}C)\rangle|^2 = \frac{Q_C(C)}{2\pi}\exp\left(-\beta^2 + {\rm Re}[e^{2i\varphi}C\beta^2 + 2e^{i\varphi}\beta]\right). \tag{5.10}$$

Consider first the adaptive mark I scheme for which $\phi = \varphi$. The ratio of $P^{\rm I}_{\rm coh}(\pi)$ to $P^{\rm I}_{\rm coh}(0)$ is

$$\frac{P^{\rm I}_{\rm coh}(\pi)}{P^{\rm I}_{\rm coh}(0)} = \frac{\int d^2C\,P^{\rm ad}_{\rm coh}(\pi, C)}{\int d^2C\,P^{\rm ad}_{\rm coh}(0, C)} \tag{5.11}$$

$$= \frac{\int d^2C\,Q_C(C)\exp\left(-\beta^2 + {\rm Re}[C\beta^2] - 2\beta\right)}{\int d^2C\,Q_C(C)\exp\left(-\beta^2 + {\rm Re}[C\beta^2] + 2\beta\right)} = \exp(-4\beta). \tag{5.12}$$

Now since $P^{\rm I}_{\rm coh}(\phi)$ is approximately Gaussian we have $P^{\rm I}_{\rm coh}(0) \simeq (2\pi V^{\rm I}_{\rm coh})^{-1/2} = (\pi/2\beta)^{-1/2}$, so that

$$P^{\rm I}_{\rm coh}(\pi) \simeq \sqrt{\frac{2\beta}{\pi}}\exp(-4\beta). \tag{5.13}$$

This agrees excellently with the numerical result plotted in Fig. 6 for $\beta = 5$. For very large $\beta$ the dominant term is obviously the exponential, which we can express by the equation

$$\log P^{\rm I}_{\rm coh}(\pi) \sim -4\beta. \tag{5.14}$$

For the adaptive mark II scheme we expect the tail of the distribution to be at least as high as that for the adaptive mark I case, which is what is indeed seen. That is because

$$\phi = \varphi + \arg(1 + C), \tag{5.15}$$

and $\arg(1 + C)$ lies between $-\pi/2$ and $\pi/2$. Thus irrespective of $C$, a result $\varphi \approx \pi$ in the tail of the distribution of the mark I measurement must also give a result $\phi$ in the tail of the mark II measurement. By this crude argument we would also expect the log of the tail of the distribution of the mark II measurement to scale in the same way:

$$\log P^{\rm II}_{\rm coh}(\pi) \sim -4\beta. \tag{5.16}$$



Clearly the relative disparity between the heights of the tails of the adaptive measurements and those of the heterodyne or canonical measurements will continue to increase as $\beta$ increases. A discussion about the reason for this disparity is to be found in App. B.
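The heterodyne tail value can be checked directly. The sketch below assumes the integral $P^{\rm het}_{\rm coh}(\pi) = \pi^{-1}\int_0^\infty r\,dr\,e^{-(\beta+r)^2}$, written in closed form with the complementary error function, together with the large-$\beta$ approximation $e^{-\beta^2}/(4\pi\beta^2)$; it also compares the leading exponents $-4\beta$ (adaptive) and $-\beta^2$ (heterodyne). The function names are illustrative.

```python
import math

def p_het_pi_exact(beta):
    """(1/pi) * int_0^inf r exp(-(beta + r)^2) dr, evaluated in closed form via erfc."""
    integral = (0.5 * math.exp(-beta ** 2)
                - 0.5 * math.sqrt(math.pi) * beta * math.erfc(beta))
    return integral / math.pi

def p_het_pi_asymptotic(beta):
    """Large-beta approximation exp(-beta^2) / (4 pi beta^2)."""
    return math.exp(-beta ** 2) / (4 * math.pi * beta ** 2)

def log_tail_ratio(beta):
    """Leading-order log of (adaptive tail / heterodyne tail): -4 beta - (-beta^2)."""
    return -4 * beta + beta ** 2

for beta in (2.0, 5.0, 8.0):
    print(beta, p_het_pi_exact(beta) / p_het_pi_asymptotic(beta), log_tail_ratio(beta))
```

At $\beta = 5$ the ratio of exact to approximate values is within a few percent of unity, which is negligible on the logarithmic scale of Fig. 6, and the leading-order tail ratio is positive for $\beta > 4$, growing without bound.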



B. M-ary encoding with coherent states



As stated above, one reason for wishing to know the 
complete phase probability distributions, including the 
tails, is for calculating the effectiveness of the various 
schemes for digital communication using phase encod- 
ing. The canonical and heterodyne POMs have been ex- 
amined before by Hall and Fuss |Q. Here we follow 
their approach, and consider M-ary encoding; that is, 
the transmission of data as the string of M-ary digits $\{0, 1, \ldots, M-1\}$. Each digit is represented by a rotated version of some single quantum state $|\psi\rangle$ whose phase distribution is peaked about zero. The digit $n$ is encoded as $\exp\left(\frac{2\pi i n}{M}a^{\dagger}a\right)|\psi\rangle$. The receiver makes a phase measurement (as defined in Sec. II B) on this state and infers from the result which digit was sent. That is, a result $\phi$ in the interval $2\pi n/M \pm \pi/M$ is interpreted as the digit $n$.

The essential measure of any mode of digital commu- 
nication is the probability that an error occurs. For each 
of the four measurement schemes we have calculated the 
minimal probability of error that may be achieved for 
each of two types of transmitted states. The first type is 
coherent states. These are important because, with the 
exception of squeezed states |ll[, they are perhaps the 
only pure single-mode quantum states that can be pro- 
duced readily enough to be considered for communication 
applications. 

Under the decoding scheme described above the probability of error is independent of the digit encoded. For the zero state it is

$$E = \int_{\pi/M}^{2\pi - \pi/M} P(\phi)\,d\phi. \tag{5.17}$$

It is easy to see that $E$ is the expectation value of the positive operator $F_E = 1 - F_C$ where

$$F_C = \sum_{n,m=0}^{\infty} \frac{\sin[\pi(m-n)/M]}{\pi(m-n)}\,H_{mn}\,|m\rangle\langle n|. \tag{5.18}$$

Using this operator, the expansion of a coherent state in terms of number states, and the values of $H_{mn}$ for $0 \le m, n \le 100$ computed earlier, one may easily determine the probability of error for coherent states with small $\beta$.
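A minimal sketch of this calculation is given below for the two schemes whose $H$ matrices have simple closed forms. It assumes $H_{mn} = 1$ for the canonical measurement and $H_{mn} = \Gamma(\frac{m+n}{2}+1)/\sqrt{m!\,n!}$ for heterodyne detection (the latter is consistent with the heterodyne sum of Eq. (4.17) but is not stated explicitly in this section); the adaptive matrices would have to be supplied separately. All function names are illustrative.

```python
import numpy as np
from math import lgamma, log

def coherent_amplitudes(beta, nmax):
    """psi_n = exp(-beta^2/2) beta^n / sqrt(n!) for n = 0..nmax."""
    n = np.arange(nmax + 1)
    log_fact = np.array([lgamma(k + 1) for k in n])
    return np.exp(-beta ** 2 / 2 + n * log(beta) - 0.5 * log_fact)

def h_canonical(nmax):
    """Canonical measurement: every element of H equals one."""
    return np.ones((nmax + 1, nmax + 1))

def h_heterodyne(nmax):
    """Assumed heterodyne matrix Gamma((m+n)/2 + 1) / sqrt(m! n!)."""
    log_fact = np.array([lgamma(k + 1) for k in range(nmax + 1)])
    mid = np.array([[lgamma((i + j) / 2 + 1) for j in range(nmax + 1)]
                    for i in range(nmax + 1)])
    return np.exp(mid - 0.5 * (log_fact[:, None] + log_fact[None, :]))

def error_probability(beta, M, H):
    """E = 1 - <psi|F_C|psi> with F_C built from the sinc kernel and H."""
    psi = coherent_amplitudes(beta, H.shape[0] - 1)
    m, n = np.indices(H.shape)
    # np.sinc(x) = sin(pi x)/(pi x), so this is sin[pi(m-n)/M]/[pi(m-n)], with 1/M on the diagonal.
    kernel = np.sinc((m - n) / M) / M
    return 1.0 - psi @ (kernel * H) @ psi

e_can = error_probability(2.0, 4, h_canonical(60))
e_het = error_probability(2.0, 4, h_heterodyne(60))
print(e_can, e_het)
```

The truncation at 60 photons is ample for $\beta \le 3$; for larger amplitudes the cutoff should be raised well above $\beta^2$.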

We can find approximate asymptotic analytic expressions for $E$ by returning to Eq. (5.17). The logarithm of $E$ will be well approximated by the logarithm of the largest value of the integrand in Eq. (5.17). Since $P(\phi)$ for coherent states is approximately monotonically decreasing from $\phi = 0$ to $\phi = \pi$ for all schemes, we can thus say

$$\log E_{\rm coh} \approx \log P_{\rm coh}(\pi/M). \tag{5.19}$$

To proceed further we make the approximation that $P_{\rm coh}(\phi)$ is Gaussian until it hits the floor value $P(\pi)$. That is,

$$\log P_{\rm coh}(\phi) \approx \max\{-\phi^2/2V_{\rm coh},\ \log P_{\rm coh}(\pi)\}, \tag{5.20}$$

so that

$$\log E_{\rm coh} \approx -\min\left\{\frac{\pi^2}{2M^2V_{\rm coh}},\ -\log P_{\rm coh}(\pi)\right\}. \tag{5.21}$$

From the results of Sec. IV B and Sec. V A we can evaluate this expression for the probability of error for the various schemes:

$$\log E^{\rm can}_{\rm coh} \approx -\beta^2\min\{2(\pi/M)^2, 1\}, \tag{5.22}$$

$$\log E^{\rm het}_{\rm coh} \approx -\beta^2\min\{(\pi/M)^2, 1\}, \tag{5.23}$$

$$\log E^{\rm I}_{\rm coh} \approx -\beta\min\{2(\pi/M)^2, 4\}, \tag{5.24}$$

$$\log E^{\rm II}_{\rm coh} \approx -\beta\min\{2\beta(\pi/M)^2, 4\}. \tag{5.25}$$

As long as $\beta > 2(M/\pi)^2$ we have the simple results that $-\log E$ scales quadratically with $\beta$ for canonical and heterodyne measurements, and linearly with $\beta$ for the two adaptive measurements. For $\beta < 2(M/\pi)^2$ the adaptive mark II measurement scales quadratically.

From Fig. 6 it is evident that the approximation of $P(\phi)$ as a Gaussian plus a constant tail is poorest for the heterodyne measurement. Thus we would not expect the expression (5.23) to be particularly good. However for this measurement scheme we can find the following expression for $E$:



TTlhct 

^coh 



-W-'f-y'dxdy (5.26) 



where $a = \cot(\pi/M)$. After quite some effort this yields the asymptotic expression



i g(p c n o t)^-/?7(i + ^) + i°g 
+ iog( / 3) + o(/r 1 ). 



a 



,2\5 



,10 s 



V^F(l + a 2 ) 9 / 2 



(5.27) 



The leading term of this differs from the above result (5.23) by at most 25% (for $M = 3$) and approaches it for large $M$. The full expression (5.27), and the above approximate expressions (5.22), (5.24) and (5.25) are plotted as a function of $\beta$ in Fig. 7 for $M = 4$. Also plotted are the exact numerical calculations of the probability of error. The expression (5.27) is evidently a very good approximation. The other analytical expressions match quite well the slopes of the curves, but are displaced vertically. For large $\beta$ the slope is of course the more important feature, and it is interesting that Eq. (5.25) does correctly predict the change from quadratic to linear behaviour of $\log E^{\rm II}_{\rm coh}$ at $\beta \approx 2(4/\pi)^2 \approx 3.24$.

From the asymptotic results it is clear that for large $\beta$ the adaptive mark II measurement has a higher probability of error than heterodyne detection. Specifically, for $M > 3$ the cross-over point is at

$$\beta \approx 4(M/\pi)^2. \tag{5.28}$$

For $M = 4$ this is $\beta \approx 6.48$, which agrees well with the numerical data in Fig. 7. At this point the error is

$$\log E_{\rm coh} \approx -16(M/\pi)^2. \tag{5.29}$$

Thus depending on whether the acceptable error level is less than or greater than this amount, the best dyne measurement scheme to use (in the sense of requiring the least energy $\hbar\omega\beta^2$ per pulse) will be heterodyne or adaptive mark II respectively.
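The cross-over can be made concrete with a short sketch of the approximate expressions. It assumes the forms $\log E \approx -\beta^2\min\{2(\pi/M)^2, 1\}$ (canonical), $-\beta^2\min\{(\pi/M)^2, 1\}$ (heterodyne), $-\beta\min\{2(\pi/M)^2, 4\}$ (mark I) and $-\beta\min\{2\beta(\pi/M)^2, 4\}$ (mark II); the function names and scheme labels are illustrative.

```python
import math

def log_error_asymptotic(scheme, beta, M):
    """Approximate log of the error probability for M-ary coherent-state encoding."""
    x = (math.pi / M) ** 2
    if scheme == "canonical":
        return -beta ** 2 * min(2 * x, 1.0)
    if scheme == "heterodyne":
        return -beta ** 2 * min(x, 1.0)
    if scheme == "mark I":
        return -beta * min(2 * x, 4.0)
    if scheme == "mark II":
        return -beta * min(2 * beta * x, 4.0)
    raise ValueError("unknown scheme: " + scheme)

def crossover_beta(M):
    """Amplitude at which heterodyne overtakes mark II (valid for M > 3)."""
    return 4 * (M / math.pi) ** 2

M = 4
beta_c = crossover_beta(M)
print(beta_c, log_error_asymptotic("heterodyne", beta_c, M),
      log_error_asymptotic("mark II", beta_c, M))
```

Below the cross-over the mark II expression is the more negative of the two, and above it the heterodyne expression is, in line with the statement about which scheme requires the least energy per pulse.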



C. M-ary encoding with optimal states

In this section we consider the probability of error for optimized states subject to a maximum-photon-number constraint. Since the probability of error is

$$E = \langle\psi|1 - F_C|\psi\rangle, \tag{5.30}$$

it is readily seen that the problem of finding the minimal probability of error for states of the form $\sum_{n=0}^{N} c_n|n\rangle$ is precisely that of finding the largest eigenvalue of the matrix formed by truncating the number-state matrix for $F_C$ of Eq. (5.18). For small $N$ this eigenvalue problem can be solved using MATLAB and the $H_{mn}$ matrices computed earlier.
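The same eigenvalue problem can be sketched in Python rather than MATLAB. The example below assumes the kernel $\sin[\pi(m-n)/M]/[\pi(m-n)]$ of $F_C$, the canonical matrix $H_{mn} = 1$, and the heterodyne matrix $H_{mn} = \Gamma(\frac{m+n}{2}+1)/\sqrt{m!\,n!}$ (an assumption consistent with Eq. (4.17)); the function names are illustrative and the adaptive matrices are not reproduced here.

```python
import numpy as np
from math import lgamma

def h_canonical(N):
    """Canonical H truncated to photon numbers 0..N."""
    return np.ones((N + 1, N + 1))

def h_heterodyne(N):
    """Assumed heterodyne H truncated to photon numbers 0..N."""
    log_fact = np.array([lgamma(k + 1) for k in range(N + 1)])
    mid = np.array([[lgamma((i + j) / 2 + 1) for j in range(N + 1)]
                    for i in range(N + 1)])
    return np.exp(mid - 0.5 * (log_fact[:, None] + log_fact[None, :]))

def min_error(H, M):
    """Minimal error probability: 1 minus the largest eigenvalue of the truncated F_C."""
    m, n = np.indices(H.shape)
    f_c = np.sinc((m - n) / M) / M * H      # real symmetric matrix
    return 1.0 - np.linalg.eigvalsh(f_c)[-1]

for N in (0, 2, 5, 10):
    print(N, min_error(h_canonical(N), 4), min_error(h_heterodyne(N), 4))
```

With $N = 0$ only the vacuum is available and the error is $1 - 1/M$, the value for pure guessing, which provides a convenient sanity check.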

Figure 8 depicts the results for quaternary ($M = 4$) encoding. It is clear from this graph that the log of $E_{\rm opt}$ for optimized states has the same sort of dependence on the maximum photon number $N$ as the log of $E_{\rm coh}$ has on the mean photon number $\beta^2$. That is, for large $N$, the heterodyne and canonical measurements scale linearly with $N$ (with the latter having the greater slope) while the adaptive measurements scale as the square root of $N$ (with the adaptive mark II having the greater slope). Once again the adaptive mark II measurement is the best realizable measurement for moderate $N$, while the heterodyne measurement becomes superior for large $N$. We would expect the cross-over point to scale as $M^4$, and for $M = 4$ the numerical data shows that it is at $N \approx 64 \approx 25(M/\pi)^4$.



VI. DISCUSSION 



In this paper we have presented the exact quantum theory of two adaptive phase measurements. From this we have confirmed the semiclassical results obtained in Ref. [7]. In particular, the phase variance from our adaptive mark II phase measurement is always less than that from a standard phase measurement (such as heterodyne detection). We have also applied our theory to an area inaccessible to the semiclassical theory, that is the complete shape of the probability distribution for the measured results $\phi$. We find that the adaptive measurement phase probability distributions have surprisingly
high tails. This has the consequence that the adap- 
tive measurement is not necessarily better than standard 
phase measurements when it comes to communication 
using M-ary encoding of data in the phase of states. 

The fact that the adaptive phase measurement is not 
necessarily superior to the standard phase measurement 
for M-ary phase encoding does not mean that it is a poor 
phase measurement, or that adaptive measurements in 
general are not useful. After all, the situation of M-ary encoding does not really call for a phase measurement; rather it calls for a measurement which can distinguish as well as possible between a finite number of known different (but not orthogonal) states. For the case of binary phase encoding using coherent states (with phases 0 and $\pi$), there is an adaptive measurement which has been
known for some time Jl9| which distinguishes these pos- 
sible states as well as quantum mechanics allows. It is 
only when M ~ N, where N is the mean photon number 
of the states, that the measurement required is really a 
phase measurement. In this limit the variance of the dis- 
tribution is the important factor, and the adaptive mark 
II phase measurement always gives a lower error rate than 
standard detection. 

Although the asymptotics for the phase variance of the adaptive schemes were already known from the semiclassical theory of Ref. [7], the quantum theory presented here sheds new light on these results and allows us to probe new issues. For example, what is the ultimate limit on the phase noise introduced by an adaptive phase measurement? In other words, how closely is it possible to approximate a canonical phase measurement by using a measurement involving dyne measurements (that is, measurements using photodetection and a local oscillator with arbitrary time-varying phase)? Although we



cannot answer this question at this stage, we can show 
that there is a lower bound on the amount of excess noise. 
This lower bound is not due to imperfections such as 
a finite local oscillator or inefficient detectors, but is a 
fundamental limitation of the method of measurement 
via photodetection. We proceed by using the analysis in 
App. B. 

It was shown in App. B that the probability for obtaining a particular phase $\phi$ is determined largely by the maximum overlap between the system state and any of the pure states which contribute to the probability operator $F(\phi)$ for that phase. For dyne measurements, these pure
states are squeezed states. As a result of this, the vari- 
ance of the measured phase probability distribution will 
be (to a good approximation) equal to the true (canoni- 
cal) phase variance of the system plus the phase variance 
of the maximum-overlap pure state. Furthermore, it was 
shown in App. B that in order to obtain a large overlap, 
the maximum-overlap squeezed state must have a well- 
defined coherent amplitude roughly equal to the coherent 
amplitude of the system. 

From these considerations we can conclude that if the system has roughly $N$ photons, then the excess phase variance will be approximately that of a squeezed state with a mean photon number of $N$. Now the minimum (canonical) phase variance of a squeezed state with a mean photon number of $N$ has been investigated by Collett pfj||, who found the asymptotic result

$$V \gtrsim \frac{\log N}{4N^2}. \tag{6.1}$$

This represents a lower bound on the excess phase variance introduced by any dyne measurement. So, for example, if $N$ is sufficiently large then the minimum measured phase variance for a state with at most $N$ photons would be

$$V^{\rm dyne}_{\min} \simeq \frac{\log N}{4N^2}. \tag{6.2}$$

This lower bound is a long way below the variance achieved by the adaptive mark II scheme presented here, for which

$$V^{\rm II}_{\min} \simeq \frac{1}{8N^{3/2}}, \tag{6.3}$$

which itself is a long way below the variance achieved by standard measurements, namely

$$V^{\rm het}_{\min} \simeq \frac{1}{4N}. \tag{6.4}$$

In fact, the lower bound (6.2) is very close to the absolute lower limit set by canonical measurement pjl]

$$V^{\rm can}_{\min} \simeq \frac{\pi^2}{N^2}. \tag{6.5}$$

Exactly how close one can come to the lower bound (6.1) by using a different feedback algorithm is a matter for future research.

APPENDIX A: THE OSTENSIBLE MOMENTS OF C

Following the text, we denote the ostensible moments of $C$ as

$$M_v^{n,m} = \langle C_v^n C_v^{*m}\rangle_Q. \tag{A1}$$

Using the rules of Ito calculus to evaluate

$$dM_v^{n,m} = \langle (C_v + dC_v)^n (C_v^* + dC_v^*)^m - C_v^n C_v^{*m}\rangle_Q, \tag{A2}$$



we find from Eq. (3.24) 



dM2> 
dv 



(A3) 

Since $M_v^{0,0} = 1$ these equations may be solved recursively to find

$$M_v^{n,m} = \frac{nM_v^{n-1,m} + mM_v^{n,m-1}}{2(n-m)^2 + n + m}. \tag{A4}$$

Recall that by convention $M^{n,m} = M_1^{n,m}$. For $n$ or $m$ equal to zero this recurrence relation can be solved to get

$$M^{n,0} = M^{0,n} = \frac{1}{(2n+1)(2n-1)\cdots 1} = \frac{1}{(2n+1)!!}. \tag{A5}$$

These boundary values allow us to rapidly compute all the desired moments $M^{n,m}$.
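The recursive evaluation of the moments can be sketched in exact rational arithmetic. The sketch assumes the recurrence $M^{n,m} = [nM^{n-1,m} + mM^{n,m-1}]/[2(n-m)^2 + n + m]$ with $M^{0,0} = 1$, a reading of Eq. (A4) that reproduces both the closed form $M^{n,0} = 1/(2n+1)!!$ and the value $\langle C^*C\rangle_Q = 1/3$ quoted in the text; the function name `moment` is illustrative.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def moment(n, m):
    """Ostensible moment M^{n,m} from the assumed recurrence, as an exact fraction."""
    if n == 0 and m == 0:
        return Fraction(1)
    total = Fraction(0)
    if n > 0:
        total += n * moment(n - 1, m)
    if m > 0:
        total += m * moment(n, m - 1)
    return total / (2 * (n - m) ** 2 + n + m)

# A few low-order values: M^{1,0}, M^{1,1}, M^{2,0}, M^{2,1}.
print(moment(1, 0), moment(1, 1), moment(2, 0), moment(2, 1))
```

Memoization keeps the cost proportional to the number of distinct $(n, m)$ pairs, so the full table needed for a truncated series expansion is cheap to build.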



APPENDIX B: THE TAILS OF THE 
DISTRIBUTIONS 

The reason for the different scaling of the tails of the adaptive measurements compared to the heterodyne measurement can be understood as follows. For heterodyne detection the dominant term is the inner product of the system state $|\beta\rangle$ with the coherent state $|-r\rangle$ for $r = 0^+$. This maximizes the overlap while still maintaining $\phi = \arg(-r) = \pi$:

$$\log P^{\rm het}_{\rm coh}(\pi) \sim \log|\langle\beta|0\rangle|^2 = -\beta^2. \tag{B1}$$



For the adaptive mark I technique the overlap will be with a squeezed state $|\alpha, \epsilon\rangle$, where (using $\phi = \varphi = \pi$)

$$\alpha = -\frac{1+C}{1-|C|^2}, \tag{B2}$$

$$\epsilon = -\frac{C\,{\rm atanh}|C|}{|C|}. \tag{B3}$$

The problem is to determine the value of $C$ which maximizes this overlap.

It is not difficult to see that the value of $C$ we seek will be real and positive. In this case

$$\alpha = -(1-C)^{-1}, \tag{B4}$$

$$\epsilon = -{\rm atanh}\,C. \tag{B5}$$


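This describes a squeezed state centred at $x = -2/(1-C)$ with an $x$-variance

$$\exp(-2\epsilon) = \frac{1+C}{1-C}. \tag{B6}$$

The overlap between $|\beta\rangle$ and $|\alpha, \epsilon\rangle$ is

$$|\langle\beta|\alpha, \epsilon\rangle|^2 = \frac{\exp\left[-(1 + \tanh\epsilon)(\beta - \alpha)^2\right]}{\cosh\epsilon} \tag{B7}$$

$$= \sqrt{1-C^2}\,\exp\left[-(1-C)\left(\beta + \frac{1}{1-C}\right)^2\right]. \tag{B8}$$

Ignoring the negligible $\sqrt{1-C^2}$, this expression is maximized for

$$1 - C = \beta^{-1}. \tag{B9}$$

This implies $\alpha = -\beta$ and $\exp(-2\epsilon) \simeq 2\beta$. Substituting this in gives

$$\log P^{\rm I}_{\rm coh}(\pi) \sim \log|\langle\beta|\alpha, \epsilon\rangle|^2 \simeq -4\beta, \tag{B10}$$

as obtained in the body of the paper.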

This derivation shows that the reason for the high tails of the adaptive distributions is the large $x$-variance of the squeezed state $|\alpha, \epsilon\rangle$, giving it a much larger overlap with $|\beta\rangle$ than has $|0\rangle$ (from the heterodyne measurement). Although this large squeezing is responsible for the high tails, it is also what allows the narrow peak of the adaptive mark II measurement. This can be seen as follows.

The most likely result for the adaptive mark II case is $\phi = \varphi + \arg(1 + C) = 0$. This is obviously most likely to occur for $\varphi = 0$, in which case the only difference is that

$$ \alpha = \frac{1 + C}{1 - |C|^{2}} . \tag{B11} $$



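Once again it is easy to see that the maximum overlap will be for $C \approx 1$. The overlap in this case is

$$ \log |\langle \beta | \alpha, \epsilon \rangle|^{2} \simeq -(1 - C) \left[ \beta - \frac{1}{1 - C} \right]^{2} . \tag{B12} $$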
This is maximized (with a value of zero) at exactly the same $C = 1 - \beta^{-1}$. This gives $\alpha = \beta$ as expected, and the same $x$-variance.

In this case what is of more interest is the $y$-variance

$$ \exp(2\epsilon) \simeq (2\beta)^{-1} . \tag{B13} $$

The intrinsic phase variance of this squeezed state is thus

$$ \frac{\langle y^{2} \rangle}{\langle x \rangle^{2}} = \frac{\exp(2\epsilon)}{(2\beta)^{2}} \simeq \frac{1}{8\beta^{3}} . \tag{B14} $$



This is precisely equal to the asymptotic expression for the excess variance

$$ V^{\rm coh}_{\rm II} - V^{\rm coh}_{\rm can} \simeq \frac{1}{8\beta^{3}} . \tag{B15} $$



The reason for this is that the measured phase distribution is at least as wide as a convolution of the true (canonical) phase distribution of the state with the true phase distribution of the most likely POM. This is completely analogous to the argument centred around Eq. (4.23) for the heterodyne case. For the adaptive mark I measurement the measured distribution is actually much wider, but the above calculation shows that for the adaptive mark II measurement all of the introduced noise is due to the quantum uncertainty in the states making up the POM. Thus the mark II phase estimate is, for large fields, the best possible estimate given the feedback algorithm (B).
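The estimate of Eq. (B14) can be checked with a short numerical sketch. This is an illustration under stated assumptions, not part of the original calculation: it takes real $C = 1 - \beta^{-1}$, uses $\exp(2\epsilon) = (1 - C)/(1 + C)$ for $\epsilon = -{\rm atanh}\, C$, and takes the mean of $x$ to be $2/(1 - C) = 2\beta$. The function name `squeezed_phase_variance` is hypothetical.

```python
def squeezed_phase_variance(beta: float) -> float:
    """Ratio <y^2> / <x>^2 for the most likely mark II squeezed state."""
    C = 1.0 - 1.0 / beta
    y_variance = (1.0 - C) / (1.0 + C)  # exp(2 eps) with eps = -atanh(C)
    x_mean = 2.0 / (1.0 - C)            # 2 alpha = 2 beta
    return y_variance / x_mean ** 2


if __name__ == "__main__":
    for beta in (5.0, 50.0, 500.0):
        exact = squeezed_phase_variance(beta)
        asymptotic = 1.0 / (8.0 * beta ** 3)
        # The ratio should approach 1 as beta grows.
        print(beta, exact, asymptotic, exact / asymptotic)
```

Under these assumptions the ratio of the two expressions is $2\beta/(2\beta - 1)$, so the asymptotic form $1/(8\beta^{3})$ overestimates nothing and is approached from above as $\beta$ increases.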
FIG. 1. Diagram for the experimental apparatus for making an adaptive phase measurement. Thin dashed lines indicate light rays and the thin continuous line labeled BS represents a 50/50 beam splitter. Medium lines represent electro-optic devices: photodetectors (PD) and an electro-optic phase modulator (EOM). Thick lines represent electrical components: a subtractor, a multiplier, an integrator, a signal generator (SG), a signal processor, and a digital read out giving the measured value of $\phi \in [0, 2\pi)$. The necessity for these particular electrical elements alone is a consequence of the feedback algorithm explained in Sec. III B.



FIG. 2. Plot of the $H$ matrix which defines the POM for phase measurements as in Eq. (2.4), for the four schemes (a) canonical, (b) heterodyne, (c) adaptive mark I, and (d) adaptive mark II.



FIG. 3. Plot of the exact (points) and asymptotic (lines) expressions for the phase variance $V^{\rm coh}$ of a coherent state of amplitude $\beta$ versus $\beta$ under the four schemes: canonical (* and solid line); heterodyne (o and dotted line); adaptive mark I (+ and dash-dot line); and adaptive mark II (x and dashed line).



FIG. 4. Plot of the exact (points) and asymptotic (lines) expressions for the excess phase variance $V^{\rm coh} - V^{\rm coh}_{\rm can}$ of a coherent state of amplitude $\beta$ versus $\beta$ under the three dyne schemes: heterodyne (o and dotted line); adaptive mark I (+ and dash-dot line); and adaptive mark II (x and dashed line).
FIG. 5. Plot of the exact (points) and asymptotic (lines) expressions for the minimum phase variance $V_{\rm min}$ of the optimal state with at most $N$ photons versus $N + 1$ under the four schemes: canonical (* and solid line); heterodyne (o and dotted line); adaptive mark I (+ and dash-dot line); and adaptive mark II (x and dashed line).

FIG. 6. Plot of the exact expressions for the log of the probability distribution $P^{\rm coh}(\phi)$ for coherent states under the four schemes: canonical (solid line); heterodyne (dotted line); adaptive mark I (dash-dot line); and adaptive mark II (dashed line). The coherent amplitude is (a) $\beta = 1$, (b) $\beta = 2$, (c) $\beta = 3.5$, (d) $\beta = 5$.



FIG. 7. Plot of the exact (points) and asymptotic (lines) expressions for the log of the probability of error $E^{\rm coh}$ for quaternary phase encoding using coherent states of amplitude $\beta$ versus $\beta$ under the four schemes: canonical (* and solid line); heterodyne (o and dotted line); adaptive mark I (+ and dash-dot line); and adaptive mark II (x and dashed line).

FIG. 8. Plot of the exact (points) expressions for the log of the minimum probability of error $E^{\rm coh}$ for quaternary phase encoding using the optimal state with at most $N$ photons versus $N$ under the four schemes: canonical (*); heterodyne (o);
adaptive mark I (+); and 